Introduction:

Primary inflammatory breast cancer (IBC) accounts for approximately 3% of new breast cancers in the
US. This form of locally advanced breast cancer is characterized clinically by erythema, warmth, and
dimpling of the skin that arise rapidly, typically within six months. IBC is generally not associated with
precursor lesions and is rapidly invasive from the outset, especially to the skin and lymphatics, and is
highly angiogenic and metastatic. Because of this disease's rapid progression, the effectiveness of
aggressive multimodality treatment is limited; the 5-year disease-free, mean survival rate is less than
45%, making IBC the most lethal form of breast cancer (1). This rapid progression is due to the
development of distant metastases, indicating that the tumors quickly acquire the ability to invade and
metastasize during tumor development. This suggests that the unique aggressive inflammatory
phenotype of IBC is the result of a limited number of concordant genetic alterations. As such, IBC
constitutes an excellent paradigm to understand aggressive phenotypes in breast cancer. Previously, our
laboratory has found concordant and consistent overexpression of RhoC GTPase in tissue samples
from patients with IBC as compared to stage-matched non-IBC (2, 3). We have also demonstrated that
RhoC GTPase occupies an integral role in the aggressive phenotype of IBC (4, 5). With the increasing
evidence that RhoC and other ras-homology family proteins play a significant role in other cancers (6,
7), the therapeutic importance of inhibiting RhoC activity is clear, highlighting the crucial need to
uncover the molecular mechanisms leading to the RhoC-driven metastatic phenotype of IBC. In spite of
this need, however, a model explaining the mechanisms of RhoC overexpression in breast cancer does
not exist. The goal of this award is to establish such a model. Our central hypothesis was that
overexpression  of  RhoC  GTPase  in  metastatic  breast  cancer  is  due  to  gene  amplification,  epigenetic 
deregulation,  transcription  factor  deregulation,  and/or  enhanced  or  differential  mRNA  stability. 
Because  of  these  cellular  and  molecular  alterations,  early  stage  IBC  is  subject  to  rapid  metastatic 
spread  through  downstream  effectors  signaling  for  invasion  and  angiogenesis. 

Body: 

As I matriculated through graduate school, I originally thought that I was going to work full time in Dr.
Merajver's lab (first graduate school rotation) studying IBC. However, as I was granted this award I was
also choosing to transfer into Dr. Arul Chinnaiyan's lab. Fortunately, I was granted permission by the
DOD to transfer the award so that I could continue working on this project from Dr. Chinnaiyan's lab.
Because of this, we have been able to establish an effective and highly collaborative arrangement with Dr.
Merajver in which I attend her bi-weekly lab meetings and work with a technician in her lab to help
complete this project. This has given me many unique experiences, for example, learning how to
create well-defined experimental protocols and making sure that she has the appropriate materials and
controls to execute each experiment. As such, continuing this DOD pre-doctoral grant has given me the
opportunity to continue existing collaborations and to continue improving my leadership skills through
working with a technician on a daily basis.

In addition to working with a technician, I have also had the opportunity to train three undergraduate
students through the University of Michigan Undergraduate Research Opportunities Program and one
Master's degree student. The students have learned several different protocols including PCR, restriction
digests, Gateway cloning, DNA miniprep, DNA maxiprep, RNA isolation, cDNA synthesis, qRT-PCR,
Western blotting, transfections of both large DNA vectors and siRNA into mammalian cells, cell
culture, production of lentivirus and lentiviral transduction, cell invasion assays, cell growth assays and
propidium iodide staining. Additionally, I have led a bi-weekly cancer biology journal club meeting with
all of the students in our lab (22 undergraduates). At the end of each semester, I help the students
compile their results to present at a lab meeting and at an undergraduate research forum by both poster
presentation and lecture. Importantly, one of the students that I have been training was awarded an NIH
summer fellowship that funded her to work in the lab for the entire summer.

While I have had the opportunity to teach students standard techniques on a daily basis and have
found reward in the successes that they have experienced, I have also been able to learn several new
experimental techniques that I would not otherwise have had the opportunity to learn without this
training grant, including Solexa high-throughput transcriptome sequencing, fluorescence in situ
hybridization, and running aCGH and microRNA arrays. Perhaps more interesting are the analysis
algorithms that I am helping to develop, including those used to identify novel gene fusions from
paired-end sequencing data (8), for the analysis of my global profiling data from these IBC cell line samples.
While little is known about the molecular origins of inflammatory breast cancer, we have made
significant advances not only in the acquisition of large profiling data sets of DNA copy number,
microRNA expression and transcriptome sequencing, but also in software development to analyze these
data. Currently, we are in the process of completing an integrated analysis from all three profiling
platforms. Additionally, we have unexpectedly found that the two IBC cell lines SUM149 and SUM190
have an extra copy of chromosome 1. Because several other stage-matched breast cancer cell lines do
not have this extra copy of chromosome 1, we are exploring the occurrence of chromosome 1
amplification in IBC clinical samples. The significance of this finding is still unclear, but will be
explored in more detail if a clinical correlation is observed.

The opportunity to work on developing novel techniques and protocols for this project has led directly to
opportunities to improve my communication and professional skills. Within the last year, I have
presented some of the work at the American Association for Cancer Research Meeting in Denver,
Colorado (April 2009). At that meeting, I was a co-author on three posters on the role of RhoC
GTPases in IBC and other breast cancers, as well as co-author on an abstract presented by podium
presentation. Additionally, I received a scholarship to attend a Keystone conference in Victoria, British
Columbia and received a nomination to become an American Association for Cancer Research Associate
Council member.

Key  Research  Accomplishments: 

Specific Aim 1: To delineate if and how gene amplification of RhoC GTPase occurs in breast cancer
and to identify novel gene fusions in inflammatory breast cancer.

• Completed RhoC FISH and discovered that IBC cell lines do not have amplification of the RhoC
locus, but carry an extra copy of chromosome 1.

• Acquired 244k Agilent aCGH data for several cell lines including the two IBC cell lines,
SUM149 and SUM190.

• Completed the Illumina bead station microRNA profiling chip V2 of a cell line panel including
HME, MCF10A, SUM149, SUM190, MDA-MB-231, HCC1937 and BT20.

• Sequenced the RNA transcriptome of both SUM149 and SUM190 using massively parallel,
high-throughput paired-end sequencing on a Solexa GA2 from Illumina.

Specific Aim 2: To determine how DNA methylation status and histone modifications regulate the
RhoC GTPase promoter, and to assess the ability of the small molecule drugs 5-azacytidine and
Trichostatin A to alter the metastatic phenotype displayed by an IBC cell line model.




Brenner:  BC083217 


• Completed Illumina bead station microRNA profiling chip V2 of a cell line panel including HME,
MCF10A, SUM149, SUM190, MDA-MB-231, HCC1937 and BT20 treated with 5-azacytidine
or Trichostatin A.

• Prepared RNA transcriptome libraries of both SUM149 and SUM190 treated with 5-azacytidine or
Trichostatin A for sequencing on an Illumina Solexa GA2.

• Treatment of MCF10A and HME cells with either 5-azacytidine or Trichostatin A revealed no
significant increase in RhoC mRNA expression, suggesting that the molecular mechanism
leading to RhoC overexpression does not involve the activation of genes repressed by either
methylation or deacetylation.

Specific Aim 3: To characterize the consequences of down-regulating the expression of the transcription
factors FoxP3, HoxA3, HoxB7, HoxB8, HoxD9, HoxD10, CREB and NFkB1, all of which contain
highly conserved binding sites in the putative RhoC GTPase promoter, on molecular pathways
regulating cell proliferation, survival and the metastatic phenotype, using an RNAi model system of
human IBC cell lines.

• Established stable shRNA knockdown cell lines for FoxP3, HoxA3, HoxB7, HoxB8, HoxD9,
HoxD10, CREB and NFkB1 in SUM149 cells.

• Identified NFkB1 as a key regulator of RhoC mRNA and protein expression in SUM149 and
SUM190 cells.

• Established a 4.0 kbp RhoC promoter reporter system.

• Developed site mutants of the RhoC promoter reporter system.

• Completed chromatin immunoprecipitation assays that demonstrated enhanced NFkB1 binding
at 2 of 3 putative NFkB1 binding sites in the RhoC promoter.

Specific Aim 4: To determine the distribution and stability of RhoC GTPase transcript variants and
their role in altering the half-life of the different mRNAs, thereby regulating total RhoC GTPase protein
expression.

• Established RhoC and GAPDH probes for Northern blot analysis.


Reportable  outcomes: 

• Established stable shRNA knockdown cell lines for FoxP3, HoxA3, HoxB7, HoxB8, HoxD9,
HoxD10, CREB and NFkB1 in SUM149 cells.

• Developed a 4.0 kbp RhoC promoter reporter.

• Published a review titled “Translocations in epithelial cancers” (9).

• Published a manuscript detailing the methodology for identification of gene fusions in epithelial
cancers, “Chimeric transcript discovery by paired-end transcriptome sequencing” (8).

• A manuscript was accepted for publication at Mol. Cancer Res., “RhoC Expression and Head and
Neck Cancer Metastasis” (In Press).

• Completed a book chapter that was accepted for publication, “The Rho GTPases in Cancer” (In
Press).

Conclusions:

Since the submission of the original application and initiation of the DOD breast cancer training
program, I have completed the core courses in Genetics, Biochemistry, Cell Biology and Ethics required
by the University's CMB program as well as comprehensive courses in Cancer Biology, Pharmacology,
Proteomics, Bioinformatics of Sequence Alignment and Mathematical Models in Biology. I have
completed a comprehensive preliminary exam on a subject unrelated to this DOD award (my thesis
project) as required by the CMB program. Additionally, I have been first author or co-author on three
manuscripts and one book chapter accepted for publication on work directly stemming from this
DOD Breast Cancer award.

Recurrent gene fusions are a prevalent class of mutations arising from
the juxtaposition of 2 distinct regions, which can generate novel
functional transcripts that could serve as valuable therapeutic targets
in cancer. Therefore, we aim to establish a sensitive, high-throughput
methodology to comprehensively catalog functional gene fusions in
cancer by evaluating a paired-end transcriptome sequencing strategy.
Not only did a paired-end approach provide a greater dynamic range
in comparison with single read based approaches, but it clearly
distinguished the high-level "driving" gene fusions, such as BCR-ABL1
and TMPRSS2-ERG, from potential lower level "passenger" gene
fusions. Also, the comprehensiveness of a paired-end approach
enabled the discovery of 12 previously undescribed gene fusions in 4
commonly used cell lines that eluded previous approaches. Using the
paired-end transcriptome sequencing approach, we observed
read-through mRNA chimeras, tissue-type restricted chimeras, converging
transcripts, diverging transcripts, and overlapping mRNA transcripts.
Last, we successfully used paired-end transcriptome sequencing to
detect previously undescribed ETS gene fusions in prostate tumors.
Together, this study establishes a highly specific and sensitive
approach for accurately and comprehensively cataloguing chimeras
within a sample using paired-end transcriptome sequencing.

bioinformatics  |  gene  fusions  |  prostate  cancer  |  breast  cancer  |  RNA-Seq 

One of the most common classes of genetic alterations is gene
fusions, resulting from chromosomal rearrangements (1).
Intriguingly, >80% of all known gene fusions are attributed to
leukemias, lymphomas, and bone and soft tissue sarcomas that
account for only 10% of all human cancers. In contrast, common
epithelial cancers, which account for 80% of cancer-related deaths,
account for only 10% of known recurrent gene fusions
(2-4). However, the recent discovery of a recurrent gene fusion,
TMPRSS2-ERG, in a majority of prostate cancers (5, 6), and
EML4-ALK in non-small-cell lung cancer (NSCLC) (7), has
expanded the realm of gene fusions as an oncogenic mechanism in
common solid cancers. Also, the restricted expression of gene
fusions to cancer cells makes them desirable therapeutic targets.
One successful example is imatinib mesylate, or Gleevec, which
targets BCR-ABL1 in chronic myeloid leukemia (CML) (8-10).
Therefore, the identification of novel gene fusions in a broad range
of cancers is of enormous therapeutic significance.

The lack of known gene fusions in epithelial cancers has been
attributed to their clonal heterogeneity and to the technical
limitations of cytogenetic analysis, spectral karyotyping, FISH, and
microarray-based comparative genomic hybridization (aCGH). Not
surprisingly, TMPRSS2-ERG was discovered by circumventing
these limitations through bioinformatics analysis of gene expression
data to nominate genes with marked overexpression, or outliers, a
signature of a fusion event (6). Building on this success, more recent
strategies have adopted unbiased high-throughput approaches, with
increased resolution, for genome-wide detection of chromosomal
rearrangements in cancer involving BAC end sequencing (11),
fosmid paired-end sequences (12), serial analysis of gene expression
(SAGE)-like sequencing (13), and next-generation DNA
sequencing (14). Although these approaches have unveiled many novel
genomic rearrangements, solid tumors accumulate multiple nonspecific
aberrations throughout tumor progression, making causal and driver
aberrations indistinguishable from secondary and insignificant mutations,
respectively.

The deep unbiased view of a cancer cell enabled by massively
parallel transcriptome sequencing has greatly facilitated gene
fusion discovery. As shown in our previous work, integrating long and
short read transcriptome sequencing technologies was an effective
approach for enriching “expressed” fusion transcripts (15).
However, despite the success of this methodology, it required substantial
overhead to leverage 2 sequencing platforms. Therefore, in this
study, we adopted a single platform paired-end strategy to
comprehensively elucidate novel chimeric events in cancer
transcriptomes. Not only was using this single platform more economical, but
it allowed us to more comprehensively map chimeric mRNA, hone
in on driver gene fusion products due to its quantitative nature, and
observe rare classes of transcripts that were overlapping, diverging,
or converging.

Results 

Chimera Discovery via Paired-End Transcriptome Sequencing. Here,
we employ transcriptome sequencing to restrict chimera
nominations to “expressed sequences,” thus enriching for potentially
functional mutations. To evaluate massively parallel paired-end
transcriptome sequencing to identify novel gene fusions, we
generated cDNA libraries from the prostate cancer cell line VCaP,
CML cell line K562, universal human reference total RNA (UHR;
Stratagene), and human brain reference (HBR) total RNA
(Ambion). Using the Illumina Genome Analyzer II, we generated 16.9
million VCaP, 20.7 million K562, 25.5 million UHR, and 23.6
million HBR transcriptome mate pairs (2 × 50 nt). The mate pairs
were mapped against the transcriptome and categorized as (i)
mapping to same gene, (ii) mapping to different genes (chimera
candidates), (iii) nonmapping, (iv) mitochondrial, (v) quality
control, or (vi) ribosomal (Table S1). Overall, the chimera candidates
represent a minor fraction of the mate pairs, comprising less than ≈1% of
the reads for each sample.
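
As a minimal illustration of the categorization step above, the Python sketch below assigns each mate pair to one of three of the six categories (same gene, chimera candidate, nonmapping) from the genes its two reads map to. The function name, the input representation, and the toy mate pairs are assumptions made for this example; it is not the pipeline used in the study, and the mitochondrial, quality control, and ribosomal categories are omitted.

```python
from collections import Counter


def categorize_mate_pair(gene_a, gene_b):
    """Categorize a mate pair from the genes its two reads map to.

    gene_a and gene_b are gene symbols, or None when a read did not map.
    Only three of the six categories named in the text are modeled.
    """
    if gene_a is None or gene_b is None:
        return "nonmapping"
    if gene_a == gene_b:
        return "same_gene"
    return "chimera_candidate"


# Hypothetical toy input: (gene hit by read 1, gene hit by read 2).
toy_pairs = [
    ("TMPRSS2", "TMPRSS2"),
    ("TMPRSS2", "ERG"),
    ("ERG", None),
    ("BCR", "ABL1"),
    ("GAPDH", "GAPDH"),
]

category_counts = Counter(categorize_mate_pair(a, b) for a, b in toy_pairs)
print(dict(category_counts))
```

In this toy input, two of the five pairs link different genes and would be carried forward as chimera candidates.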

We believe that a paired-end strategy offers multiple advantages
over single read based approaches, such as alleviating the reliance
on sequencing the reads traversing the fusion junction, increased
coverage provided by sequencing reads from the ends of a
transcribed fragment, and the ability to resolve ambiguous mappings
(Fig. S1). Therefore, to nominate chimeras, we leveraged each of
these aspects in our bioinformatics analysis. We focused on both
mate pairs encompassing and/or spanning the fusion junction by
analyzing 2 main categories of sequence reads: chimera candidates
and nonmapping (Fig. S2A). The resulting chimera candidates from
the nonmapping category that span the fusion boundary were
merged with the chimeras found to encompass the fusion boundary,
revealing 119, 144, 205, and 294 chimeras in VCaP, K562, HBR, and
UHR, respectively.

Comparison of a Paired-End Strategy Against Existing Single Read
Approaches. To assess the merit of adopting a paired-end
transcriptome approach, we compared the results against existing single read
approaches. Although current RNA sequencing (RNA-Seq)
studies have been using 36-nt single reads (16, 17), we increased the
likelihood of spanning a fusion junction by generating 100-nt long
single reads using the Illumina Genome Analyzer II. Also, we chose
this length because it would facilitate a more comparable amount
of sequencing time as required for sequencing both 50-nt mate
pairs. In total, we generated 7.0, 59.4, and 53.0 million 100-nt
transcriptome reads for VCaP, UHR, and HBR, respectively, for
comparison against paired-end transcriptome reads from matched
samples.

Because the UHR is a mixture of cancer cell lines, we expected
to find numerous previously identified gene fusions. Therefore, we
first assessed the depth of coverage of a paired-end approach
against long single reads by directly comparing the normalized
frequency of sequence reads supporting 4 previously identified gene
fusions [TMPRSS2-ERG (5, 6), BCR-ABL1 (18), BCAS4-BCAS3
(19), and ARFGEF2-SULF2 (20)]. As shown in Fig. 1A, we
observed a marked enrichment of paired-end reads compared with
long single reads for each of these well characterized gene fusions.

We observed that TMPRSS2-ERG had a >10-fold enrichment
between paired-end and single read approaches. The schematic
representation in Fig. 1B indicates the distribution of reads
confirming the TMPRSS2-ERG gene fusion from both paired-end and
single read sequencing. As expected, the longer reads improve the
number of reads spanning known gene fusions. For example, had
we sequenced a single 36-mer (shown in red text), 11 of the 17
chimeras, shown in the bottom portion of the long single reads,
would not have spanned the gene fusion boundary, but instead,
would have terminated before the junction and, therefore, only
aligned to TMPRSS2. However, despite the improved results, only
17 chimeric reads were generated from 7.0 million long single read
sequences. In contrast, paired-end sequencing resulted in 552 reads
supporting the TMPRSS2-ERG gene fusion from ≈17 million
sequences.
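
The >10-fold figure can be checked with simple arithmetic. The Python sketch below assumes a plain reads-per-million normalization and uses the 16.9 million VCaP mate pairs reported earlier as the paired-end total; the normalization actually used for the figure may differ, and the function name is introduced here only for illustration.

```python
def reads_per_million(supporting_reads, total_reads_millions):
    """Supporting reads per million sequenced reads (or mate pairs)."""
    return supporting_reads / total_reads_millions


# Counts quoted in the text for TMPRSS2-ERG in VCaP.
single_rpm = reads_per_million(17, 7.0)     # 100-nt single reads
paired_rpm = reads_per_million(552, 16.9)   # 2 x 50 nt mate pairs

fold_enrichment = paired_rpm / single_rpm
print(f"single: {single_rpm:.2f}/M, paired: {paired_rpm:.2f}/M, "
      f"fold: {fold_enrichment:.1f}")
```

Under these assumptions the paired-end support is roughly 13 times the single read support, consistent with the >10-fold enrichment stated above.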

Because we are using sequence based evidence to nominate a
chimera, we hypothesized that the approach providing the
maximum nucleotide coverage is more likely to capture a fusion
junction. We calculated an in silico insert size for each sample using
mate pairs aligning to the same gene, and found a mean insert size
of ≈200 nt. Then, we compared the total coverage from single reads
(coverage is equivalent to the total number of pass filter reads
multiplied by the read length) with the paired-end approach (coverage is
equivalent to the sum of the insert size with the length of each read)
(Fig. S2B). Overall, we observed an average coverage of 848.7 and
757.3 Mb using single read technology, compared with 2,553.3 and
2,363 Mb from paired-end in UHR and HBR, respectively. This
≈3-fold increase in coverage in the paired-end samples compared
with the long read approach, per lane, could explain the increased
dynamic range we observed using a paired-end strategy.
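
The roughly 3-fold figure follows directly from the per-lane coverage values quoted above. The short Python check below only divides the reported numbers; the dictionary layout and variable names are assumptions made for illustration, and it does not attempt to recompute the coverage values themselves.

```python
# Average coverage per lane, in megabases, as quoted in the text.
reported_coverage_mb = {
    "UHR": {"single": 848.7, "paired": 2553.3},
    "HBR": {"single": 757.3, "paired": 2363.0},
}

# Ratio of paired-end coverage to long single read coverage per sample.
fold_coverage = {
    sample: values["paired"] / values["single"]
    for sample, values in reported_coverage_mb.items()
}

for sample, fold in fold_coverage.items():
    print(f"{sample}: {fold:.2f}-fold")
```

Both ratios come out slightly above 3, in line with the statement in the text.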

Next we wanted to identify chimeras common to both strategies.
The long read approach nominated 1,375 and 1,228 chimeras,
whereas with a paired-end strategy, we only nominated 225 and 144
chimeras in UHR and HBR, respectively. As shown in the Venn
diagram (Fig. 1C), there were 32 and 31 candidates common to both
technologies for UHR and HBR, respectively. Within the common
UHR chimeric candidates, we observed previously identified gene
fusions BCAS4-BCAS3, BCR-ABL1, ARFGEF2-SULF2, and
RPS6KB1-TMEM49 (13). The remaining chimeras, nominated by
both approaches, represent a high fidelity set. Therefore, to further
assess whether a paired-end strategy has an increased dynamic
range, we compared the ratio of normalized mate pair reads against
single reads for the remaining chimeras common to both
technologies. We observed that 93.5 and 93.9% of UHR and HBR
candidates, respectively, had a higher ratio of normalized mate pair
reads to single reads (Table S2), confirming the increased dynamic
range offered by a paired-end strategy. We hypothesize that the
greater number of nominated candidates specific to the long read
approach represents an enrichment of false positives, as observed
when using the 454 long read technology (15, 21).
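
The comparison of the two nomination sets can be sketched as a set intersection followed by a per-chimera comparison of normalized support. In the Python example below, the function names, the placeholder chimera identifiers, and all numeric values are hypothetical and serve only to show the logic; they are not data from this study.

```python
def common_candidates(paired_end, single_read):
    """Chimeras nominated by both strategies (keys present in both dicts)."""
    return set(paired_end) & set(single_read)


def fraction_higher_in_paired(paired_end, single_read):
    """Fraction of shared chimeras with more normalized paired-end support."""
    shared = common_candidates(paired_end, single_read)
    if not shared:
        return 0.0
    higher = sum(1 for name in shared if paired_end[name] > single_read[name])
    return higher / len(shared)


# Hypothetical normalized support values keyed by placeholder chimera names.
toy_paired = {"CHIMERA_1": 40.0, "CHIMERA_2": 12.0, "CHIMERA_3": 1.0, "CHIMERA_4": 3.0}
toy_single = {"CHIMERA_1": 5.0, "CHIMERA_2": 2.0, "CHIMERA_3": 1.5, "CHIMERA_5": 0.5}

shared = common_candidates(toy_paired, toy_single)
fraction = fraction_higher_in_paired(toy_paired, toy_single)
print(sorted(shared), round(fraction, 3))
```

Keeping the two strategies as dictionaries keyed by chimera makes the Venn-style overlap a single set operation and leaves the strategy-specific candidates easy to recover with set differences.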

Paired-End Approach Reveals Novel Gene Fusions. We were
interested in determining whether the paired-end libraries could detect
novel gene fusions. Among the top chimeras nominated from
VCaP, HBR, UHR, and K562, many were already known, including
TMPRSS2-ERG, BCAS4-BCAS3, BCR-ABL1, USP10-ZDHHC7,
and ARFGEF2-SULF2. Also ranking among these well known gene
fusions in UHR was a fusion on chromosome 13 between GAS6 and
RASA3 (Fig. S3A and Table S2). The fact that GAS6-RASA3
ranked higher than BCR-ABL1 suggests that it may be a driving
fusion in one of the cancer cell lines in the RNA pool.

Another observation was that there were 2 candidates among the
top 10 found in both UHR and K562. This observation was
intriguing, because hematological malignancies are not considered
to have multiple gene fusion events. In addition to BCR-ABL1, we
were able to detect a previously undescribed interchromosomal
gene fusion between exon 23 of NUP214 located at chromosome
9q34.13 with exon 2 of XKR3 located at chromosome 22q11.1. Both
of these genes reside on chromosome 22 and 9 in close proximity
to BCR and ABL1, respectively (Fig. S3B). We confirmed the
presence of NUP214-XKR3 in K562 cells using qRT-PCR, but were
unable to detect it across an additional 5 CML cell lines tested
(SUP-B15, MEG-01, KU812, GDM-1, and Kasumi-4) (Fig. S3C).
These results suggest that NUP214-XKR3 is a “private” fusion that
originated from additional complex rearrangements after the
translocation that generated BCR-ABL1 and a focal amplification of
both gene regions.

Although we were able to detect BCR-ABL1 and NUP214-XKR3
in both UHR and K562, there was a marked reduction in
the mate pairs supporting these fusions in UHR. Although a
diluted signal is expected, because UHR is a pool of samples, this
provides evidence that pooling samples can serve as a useful
approach for nominating top expressing chimeras, and
potentially enrich for “driver” chimeras.

Previously Undescribed Prostate Gene Fusions. Our previous work
using integrative transcriptome sequencing to detect gene fusions in
cancer revealed multiple gene fusions, demonstrating the
complexity of the prostate transcriptomes of VCaP and LNCaP (15). Here,
we exploit the comprehensiveness of a paired-end strategy on the
same cell lines to reveal novel chimeras. In the circular plot shown
in Fig. S4A, we displayed all experimentally validated paired-end
chimeras in the larger red circle. We found that all of the previously
discovered chimeras in VCaP and LNCaP comprised a subset of the
paired-end candidates, as displayed in the inner black circle.

As expected, TMPRSS2-ERG was the top VCaP candidate. In
addition to “rediscovering” the USP10-ZDHHC7, HJURP-INPP4A,
and EIF4E2-HJURP gene fusions, a paired-end approach revealed
several previously undescribed gene fusions in VCaP. One such
example was an interchromosomal gene fusion between ZDHHC7,
on chromosome 16, with ABCB9, residing on chromosome 12, that
was validated by qRT-PCR (Fig. S3D). Interestingly, the 5' partner,
ZDHHC7, had previously been validated as a complex
intrachromosomal gene fusion with USP10 (15). Both fusions have mate
pairs aligning to the same exon of ZDHHC7 (15), suggesting that
their breakpoints are in adjacent introns (Fig. S3D).

Fig. 1. Dynamic range and sensitivity of the paired-end transcriptome analysis relative to single read approaches. (A) Comparison of paired-end (blue) and long single
transcriptome reads (black) supporting known gene fusions TMPRSS2-ERG, BCR-ABL1, BCAS4-BCAS3, and ARFGEF2-SULF2. (B) Schematic representation of
TMPRSS2-ERG in VCaP, comparing mate pairs with long single transcriptome reads. (Upper) Frequency of mate pairs, shown in log scale, are divided based on whether they
encompass or span the fusion boundary; (Lower) 100-mer single transcriptome reads spanning TMPRSS2-ERG fusion boundary. First 36 nt are highlighted in red. (C)
Venn diagram of chimera nominations from both a paired-end (orange) and long single read (blue) strategy for UHR and HBR.

Another previously undescribed VCaP interchromosomal gene
fusion that we discovered was between exon 2 of TIA1, residing on
chromosome 2, with exon 3 of DIRC2, or disrupted in renal
carcinoma 2, located on chromosome 3. TIA1-DIRC2 was validated
by qRT-PCR and FISH (Fig. S5). In total, we confirmed an
additional 4 VCaP and 2 LNCaP chimeras (Fig. S6). Overall, these
fusions demonstrate that paired-end transcriptome sequencing can
nominate candidates that have eluded previous techniques,
including other massively parallel transcriptome sequencing approaches.

Fig. 2. RNA based chimeras. (A) Heatmaps showing the normalized number of reads supporting each read-through chimera across samples ranging from 0 (white)
to 30 (red). (Upper) The heatmap highlights broadly expressed chimeras in UHR, HBR, VCaP, and K562. (Lower) The heatmap highlights the expression of the top
ranking restricted gene fusions that are enriched with interchromosomal and intrachromosomal rearrangements. (B) Illustrative examples classifying RNA-based
chimeras into (i) read-throughs, (ii) converging transcripts, (iii) diverging transcripts, and (iv) overlapping transcripts. (C Upper) Paired-end approach links reads from
independent genes as belonging to the same transcriptional unit (Right), whereas a single read approach would assign these reads to independent genes (Left).
(Lower) The single read approach requires that a chimera span the fusion junction (Left), whereas a paired-end approach can link mate pairs independent of gene
annotation (Right).

Distinguishing Causal Gene Fusions from Secondary Mutations. We
were next interested in determining whether the dynamic range
provided by paired-end sequencing can distinguish known
high-level “driving” gene fusions, such as known recurrent gene fusions
BCR-ABL1 and TMPRSS2-ERG, from lower level “passenger”
fusions. Therefore, we plotted the normalized mate pair coverage
at the fusion boundary for all experimentally validated gene fusions
for the 2 cell lines that we sequenced harboring recurrent gene
fusions, VCaP and K562. As shown in Fig. S4B, we observed that
both driver fusions, TMPRSS2-ERG and BCR-ABL1, show the
highest expression among the validated chimeras in VCaP and
K562, respectively. This observation suggests a paired-end
nomination strategy for selecting putative driver gene fusions among
private nonspecific gene fusions that lack detectable levels of
expression across a panel of samples (15).

Previously  Undescribed  Breast  Cancer  Gene  Fusions.  Our  ability  to 
detect  previously  undescribed  prostate  gene  fusions  in  VCaP  and 
LNCaP demonstrated the comprehensiveness of paired-end
transcriptome sequencing compared with an integrated approach, using
short  and  long  transcriptome  reads.  Therefore,  we  extended  our 
paired-end  analysis  by  using  breast  cancer  cell  line  MCF-7,  which 
has  been  mined  for  fusions  using  numerous  approaches  such  as 
expressed  sequence  tags  (ESTs)  (22),  array  CGH  (23),  single 
nucleotide  polymorphism  arrays  (24),  gene  expression  arrays  (25), 
end  sequence  profiling  (20,  26),  and  paired-end  diTag  (PET)  (13). 

A histogram (Fig. S4C) of the top ranking MCF-7 candidates
highlights BCAS4-BCAS3 and ARFGEF2-SULF2 as the top 2 ranking
candidates, whereas other previously reported candidates, such
as SULF2-PRICKLE, DEPDC1B-ELOVL7, RPS6KB1-TMEM49,
and CXorf15-SYAP1, were interspersed among a comprehensive list
of previously undescribed putative chimeras. To confirm that these
previously undescribed nominations were not false positives, we
experimentally validated 2 interchromosomal and 3 intrachromosomal
candidates using qRT-PCR (Fig. S6). Overall, not only was
a paired-end approach able to detect gene fusions that have eluded
numerous existing technologies, it also revealed 5 previously
undescribed mutations in breast cancer.

RNA-Based Chimeras. Although many of the inter- and intrachromosomal
rearrangements that we nominated were found within a
single sample, we observed many chimeric events shared across
samples. We identified 11 chimeric events common to UHR, VCaP,
K562, and HBR (Table S3). Via heatmap representation (Fig. 2A)
of the normalized frequency of mate pairs supporting each chimeric
event, we can observe that these events are broadly transcribed, in
contrast to the top restricted chimeric events. Also, we found that
100% of the broadly expressed chimeras resided adjacent to one
another on the genome, whereas only 7.7% of the restricted
candidates were neighboring genes. This discrepancy can be
explained by the enrichment of inter- and intrachromosomal
rearrangements in the restricted set.

Unlike previously characterized restricted read-throughs, such
as SLC45A3-ELK4 (15), which are found adjacent to one another
and in the same orientation, we found that the majority of the
broadly expressed chimera candidates resided adjacent to one
another in different orientations. Therefore, we have categorized
these events as (i) read-throughs, adjacent genes in the same
orientation, (ii) diverging genes, adjacent genes in opposite
orientation whose 5' ends are in close proximity, (iii) convergent genes,
adjacent genes in opposite orientation whose 3' ends are in close
proximity, and (iv) overlapping genes, adjacent genes that share
common exons (Fig. 2B). Based on this classification, we found 1
read-through, 2 convergent genes, 6 divergent genes, and 2
overlapping genes. Also, we found that ~81.8% of these chimeras had
at least 1 supporting EST, providing independent confirmation of
the event (Table S3). In contrast to paired-end, single read
approaches would likely miss these instances, as each mate would have
aligned to its respective gene based on the current annotations
(Fig. 2C). Also, these instances may represent extensions of a
transcriptional unit, which would not be detectable by a single read
approach that identifies chimeric reads that span exon boundaries
of independent genes. Overall, we believe that many of these
broadly expressed RNA chimeras represent instances where mate
pairs are revealing previously undescribed annotation for a
transcriptional unit.
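
The four categories above can be illustrated with a small sketch. This is not the authors' code: the `Gene` record, its coordinate fields, and the use of simple coordinate overlap as a stand-in for "share common exons" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are on one chromosome."""
    name: str
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # '+' or '-'


def classify_adjacent(a: Gene, b: Gene) -> str:
    """Classify two neighboring genes by relative orientation.

    Assumption: coordinate overlap approximates the paper's
    "overlapping genes" (genes that share common exons).
    """
    left, right = sorted((a, b), key=lambda g: g.start)
    if left.end >= right.start:
        return "overlapping"
    if left.strand == right.strand:
        return "read-through"
    # Opposite strands: a '+' gene on the left ends (3') toward the
    # '-' gene on the right, whose 3' end is its leftmost coordinate.
    if left.strand == "+":
        return "convergent"
    # Otherwise the 5' ends face each other.
    return "divergent"
```

For example, a minus-strand gene at 100-200 followed by a plus-strand gene at 300-400 has its 5' ends in close proximity and is reported as divergent.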




Previously Undescribed ETS Gene Fusions in Clinically Localized Prostate
Cancer. Given the high prevalence of gene fusions involving
ETS oncogenic transcription factor family members in prostate
tumors, we applied paired-end transcriptome sequencing for gene
fusion discovery in prostate tumors lacking previously reported
ETS fusions. For 2 prostate tumors, aT52 and aT64, we generated
6.2 and 7.4 million transcriptome mate pairs, respectively. In aT64,
we found that HERPUD1, residing on chromosome 16, juxtaposed
in front of exon 4 of ERG (Fig. 3A), which was validated by
qRT-PCR (Fig. S6) and FISH (Fig. 3B), thus identifying a third 5'
fusion partner for ERG, after TMPRSS2 (6) and SLC45A3 (27), and
presumably, HERPUD1 also mediates the overexpression of ERG
in a subset of prostate cancer patients. Also, just as TMPRSS2 and
SLC45A3 have been shown to be androgen regulated by qRT-PCR
(5), we found HERPUD1 expression, via RNA-Seq, to be responsive
to androgen treatment (Fig. S7). Also, ChIP-Seq analysis
revealed androgen binding at the 5' end of HERPUD1 (Fig. S7).

Also, in the second prostate tumor sample (aT52), we discovered
an interchromosomal gene fusion between the 5' end of a prostate
cDNA clone, AX747630 (FLJ35294), residing on chromosome 17,
with exon 4 of ETV1, located on chromosome 7 (Fig. 3C), which was
validated via qRT-PCR (Fig. S6) and FISH (Fig. 3D). Interestingly,
this fusion has previously been reported in an independent sample
found by a fluorescence in situ hybridization screen (27), thus
demonstrating that it is recurrent in a subset of prostate cancer
patients. As previously reported, gene expression via RNA-Seq
confirmed that AX747630 is an androgen-inducible gene (Fig. S7).
Also, ChIP-Seq revealed androgen occupancy at the 5' end of
AX747630 (Fig. S7).

Discussion 

This  study  demonstrates  the  effectiveness  of  paired-end  massively 
parallel  transcriptome  sequencing  for  fusion  gene  discovery.  By 
using  a  paired-end  approach,  we  were  able  to  rediscover  known 
gene  fusions,  comprehensively  discover  previously  undescribed 
gene fusions, and home in on causal gene fusions. The ability to
detect 12 previously undescribed gene fusions in 4 commonly used
cell lines that eluded any previous efforts conveys the superior
sensitivity of a paired-end RNA-Seq strategy compared with
existing approaches. Also, it suggests that we may be able to unveil
previously  undescribed  chimeric  events  in  previously  characterized 
samples  believed  to  be  devoid  of  any  known  driver  gene  fusions  as 
exemplified  by  the  discovery  of  previously  undescribed  ETS  gene 
fusions  in  2  clinically  localized  prostate  tumor  samples  that  lacked 
known  driver  gene  fusions. 

By analyzing the transcriptome at unprecedented depth, we have
revealed numerous gene fusions, demonstrating the prevalence of
a relatively under-represented class of mutations. However, one of
the major goals remains to discover recurrent gene fusions and to
distinguish them from secondary, nonspecific chimeras. Although
quantifying expression levels is not proof of whether a gene fusion
is a driver or passenger, because a low-level gene fusion could still
be causative, it is still of major significance that a paired-end strategy
clearly distinguished known high-level driving gene fusions, such as
BCR-ABL1 and TMPRSS2-ERG, from potential lower level
passenger chimeras. Overall, these fusions serve as a model for
employing a paired-end nomination strategy for prioritizing leads
likely to be high-level driving gene fusions, which would
subsequently undergo further functional and experimental evaluation.

Fig. 3. Discovery of previously undescribed ETS gene fusions in localized
prostate cancer. (A) Schematic representation of the interchromosomal gene
fusion between exon 1 of HERPUD1 (red), residing on chromosome 16, with exon
4 of ERG (blue), located on chromosome 21. (B) Schematic representation
showing genomic organization of HERPUD1 and ERG genes. Horizontal red and green
bars indicate the location of BAC clones. (Lower) FISH analysis using BAC clones
showing HERPUD1 and ERG in a normal tissue (Left), deletion of the ERG 5' region
in tumor (Center), and HERPUD1-ERG fusion in a tumor sample (Right). (C)
Schematic representation of the interchromosomal gene fusion between
FLJ35294 (green), residing on chromosome 17, with exon 4 of ETV1 (orange),
located on chromosome 7. (D Upper) Schematic representation of the genomic
organization of FLJ35294 and ETV1 genes. (Lower) FISH analysis using BAC clones
showing split of ETV1 in tumor sample (Left) and the colocalization of FLJ35294
and ETV1 in a tumor sample (Right).

One of the major advantages of using a transcriptome approach
is that it enables us to identify rearrangements that are not
detectable at the DNA level. For example, conventional cytogenetic
methods would miss gene fusions produced by paracentric
inversions, or submicroscopic events, such as GAS6-RASA3. Also,
transcriptome sequencing can unveil RNA chimeras lacking DNA
aberrations, as demonstrated by the discovery of a recurrent,
prostate-specific read-through of SLC45A3 with ELK4 in prostate
cancers. Further classification of RNA-based events using
paired-end sequencing revealed numerous broadly expressed chimeras
between adjacent genes. Although these events were not necessarily
read-through events, because they typically had different
orientations, we believe they represent extensions of transcriptional units
beyond their annotated boundaries. Unlike single read based
approaches, which require chimeras to span exon boundaries of
independent genes, we were able to detect these events using
paired-end sequencing, which could have significant impact for
improving how we annotate transcriptional units.

Overall, we have demonstrated the advantages of employing a
paired-end transcriptome strategy for chimera discovery,
established a methodology for mining chimeras, and extensively
catalogued chimeras in prostate and hematological cancer models. We
believe that the sensitivity of this approach will be of broad impact
and significance for revealing novel causative gene fusions in
various cancers while revealing additional private gene fusions that
may contribute to tumorigenesis or cooperate with driver gene
fusions.

Methods 

Paired-End Gene Fusion Discovery Pipeline. Mate pair transcriptome reads were
mapped to the human genome (hg18) and Refseq transcripts, allowing up to 2
mismatches, using Efficient Alignment of Nucleotide Databases (ELAND) pair
within the Illumina Genome Analyzer Pipeline software. Illumina export output
files were parsed to categorize passing filter mate pairs as (i) mapping to the same
transcript, (ii) ribosomal, (iii) mitochondrial, (iv) quality control, (v) chimera
candidates, and (vi) nonmapping. Chimera candidates and nonmapping categories
were used for gene fusion discovery. For the chimera candidates category, the
following criteria were used: (i) mate pairs must be of high mapping quality (best
unique match across genome), (ii) best unique mate pairs do not have a more
logical alternative combination (i.e., best mate pairs suggest an interchromosomal
rearrangement, whereas the second best mapping for a mate reveals the
pair has an alignment within the expected insert size), (iii) the sum of the
distances between the most 5' and 3' mate on both partners of the gene fusion
must be <500 nt, and (iv) mate pairs supporting a chimera must be nonredundant.
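
The four chimera-candidate criteria can be sketched as a filter. This is a minimal illustration rather than the published pipeline: the `MatePair` fields are hypothetical, and criterion (iii) is read here as "the span of supporting mates on partner A plus the span on partner B must be below 500 nt", which is one plausible interpretation of the text.

```python
from dataclasses import dataclass

MAX_SPAN_SUM = 500  # nt, threshold from criterion (iii)


@dataclass(frozen=True)
class MatePair:
    """Hypothetical record for one mate pair supporting a candidate."""
    pos_a: int                         # mate position on partner A
    pos_b: int                         # mate position on partner B
    unique_best: bool                  # criterion (i)
    has_concordant_alternative: bool   # violates criterion (ii) if True


def filter_candidates(pairs):
    """Return the supporting mate pairs that pass criteria (i)-(iv)."""
    kept = {}
    for p in pairs:
        if not p.unique_best or p.has_concordant_alternative:
            continue
        # Criterion (iv): identical positions count once (nonredundant).
        kept.setdefault((p.pos_a, p.pos_b), p)
    support = list(kept.values())
    if not support:
        return []
    span_a = max(p.pos_a for p in support) - min(p.pos_a for p in support)
    span_b = max(p.pos_b for p in support) - min(p.pos_b for p in support)
    # Criterion (iii), under the interpretation stated above.
    if span_a + span_b >= MAX_SPAN_SUM:
        return []
    return support
```

Rejecting the whole candidate when the spans are too wide is a design choice of this sketch; a production filter might instead trim outlying mate pairs.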

In addition to mining mate pairs encompassing a fusion boundary, the
nonmapping category was mined for mate pairs that had 1 read mapping to a gene,
whereas its corresponding read fails to align, because it spans the fusion
boundary. First, the annotated transcript that the "mapping" mate pair aligned against
was extracted, because this transcript represents one of the potential partners
involved in the gene fusion. The "nonmapping" mate pair was then aligned
against all of the exon boundaries of the known gene partner to identify a perfect
partial alignment. A partial alignment confirms that the nonmapping mate pair
maps to our expected gene partner while revealing the portion of the
nonmapping mate pair, or overhang, aligning to the unknown partner. The overhang is
then aligned against the exon boundaries of all known transcripts to identify the
fusion partner. This process is done using a Perl script that extracts all possible
University of California Santa Cruz (UCSC) and Refseq exon boundaries looking
for a single perfect best hit.
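
The overhang search described above can be sketched as follows. The original was a Perl script that is not reproduced in the text, so this Python version is only an assumed reconstruction of the idea: the function names, the `min_anchor` length, and the dictionaries of exon-end and exon-start sequences are hypothetical, and only the orientation in which the known gene is the 5' partner is handled.

```python
def split_at_exon_end(read, exon_end_seqs, min_anchor=8):
    """Find the longest read prefix that perfectly matches the 3' end
    of an exon of the known partner.

    Returns (exon_id, overhang) or None. `exon_end_seqs` maps a
    hypothetical exon identifier to the sequence ending at the exon's
    3' boundary.
    """
    best = None
    for exon_id, seq in exon_end_seqs.items():
        longest = min(len(read) - 1, len(seq))  # keep overhang non-empty
        for k in range(longest, min_anchor - 1, -1):
            if seq.endswith(read[:k]):
                if best is None or k > best[0]:
                    best = (k, exon_id)
                break
    if best is None:
        return None
    k, exon_id = best
    return exon_id, read[k:]


def find_partner(overhang, exon_start_seqs):
    """Return the single exon whose 5' boundary sequence begins with
    the overhang, or None when there is no hit or more than one."""
    hits = [eid for eid, seq in exon_start_seqs.items()
            if seq.startswith(overhang)]
    return hits[0] if len(hits) == 1 else None
```

Requiring exactly one hit mirrors the "single perfect best hit" rule: an overhang that matches several exon boundaries is treated as uninformative rather than resolved arbitrarily.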

Mate pairs spanning the fusion boundary are merged with mate pairs
encompassing the fusion boundary. At least 2 independent mate pairs are required to
support a chimera nomination, which can be achieved by (i) 2 or more
nonredundant mate pairs spanning the fusion boundary, (ii) 2 or more nonredundant
mate pairs encompassing a fusion boundary, or (iii) 1 or more mate pairs
encompassing a fusion boundary and 1 or more mate pairs spanning the fusion
boundary. All chimera nominations were normalized based on the cumulative number
of mate pairs encompassing or spanning the fusion junction per million mate
pairs passing filter.
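
The nomination rule and the normalization can be written out as a short sketch. The function and parameter names are illustrative assumptions; the counts are taken to be nonredundant mate pairs, as the text requires.

```python
def is_nominated(n_encompassing: int, n_spanning: int) -> bool:
    """Apply the three support conditions listed in the text."""
    return (
        n_spanning >= 2                                # condition (i)
        or n_encompassing >= 2                         # condition (ii)
        or (n_encompassing >= 1 and n_spanning >= 1)   # condition (iii)
    )


def normalized_support(n_encompassing: int, n_spanning: int,
                       passing_filter_pairs: int) -> float:
    """Supporting mate pairs per million mate pairs passing filter."""
    if passing_filter_pairs <= 0:
        raise ValueError("passing_filter_pairs must be positive")
    total = n_encompassing + n_spanning
    return total / (passing_filter_pairs / 1_000_000)
```

For nonnegative integer counts the three conditions together are equivalent to requiring at least 2 supporting mate pairs in total; they are kept separate here only to mirror the wording of the method.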

RNA Chimera Analysis. Chimeras found from UHR, HBR, VCaP, and K562 were
grouped based on whether they showed expression in all samples, "broadly
expressed," or a single sample, "restricted expression." Because UHR includes
K562, chimeras found in only these 2 samples were also considered
restricted. Heatmap visualization was conducted by using TIGR's MultiExperiment
Viewer (TMeV) version 4.0 (www.tm4.org).
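
The grouping rule can be sketched as below. The function name and the "other" label for chimeras that fit neither group are assumptions of this illustration; the text itself only defines the broad and restricted groups.

```python
SAMPLES = frozenset({"UHR", "HBR", "VCaP", "K562"})


def expression_class(samples_with_support) -> str:
    """Group a chimera by the set of samples in which it was found."""
    found = set(samples_with_support)
    if found == SAMPLES:
        return "broadly expressed"
    # A chimera seen only in UHR and K562 is treated as restricted,
    # because K562 is a component of UHR.
    if len(found) == 1 or found == {"UHR", "K562"}:
        return "restricted expression"
    return "other"
```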

Additional  Details.  Additional  details  can  be  found  in  SI  Text. 

Genomic translocations leading to the expression of chimeric transcripts characterize several hematologic,
mesenchymal  and  epithelial  malignancies.  While  several  gene  fusions  have  been  linked  to  essential 
molecular  events  in  hematologic  malignancies,  the  identification  and  characterization  of  recurrent  chimeric 
transcripts  in  epithelial  cancers  has  been  limited.  However,  the  recent  discovery  of  the  recurrent  gene  fusions 
in  prostate  cancer  has  sparked  a  revitalization  of  the  quest  to  identify  novel  rearrangements  in  epithelial 
malignancies.  Here,  the  molecular  mechanisms  of  gene  fusions  that  drive  several  epithelial  cancers  and 
the  recent  technological  advances  that  increase  the  speed  and  reliability  of  recurrent  gene  fusion  discovery 
are  explored. 

1. Introduction

Throughout history, technological advances are often followed by
discoveries that dramatically alter our perceptions of disease etiology.
For example, after the term "chromosome" was introduced in the
mid-1800s, several German pathologists began using techniques to
compare gross mitotic changes in tissue sections from different
human malignancies [1]. Almost half a century later, Theodor
Boveri published a critical hypothesis that "mammalian tumors might
be initiated by mitotic abnormalities that resulted in a change in the
number of chromosomes in the cell (aneuploidy)", based on the
observation that sea urchin embryos would frequently engage in
uncommon development following mitotic abnormality [2]. As time
passed, breakthroughs arose that dramatically increased the quality
and reproducibility of cytogenetic techniques, such as the use of
colchicine, which arrests cells in mitosis by inhibiting microtubule
assembly. As a result of these observations, the general hypotheses
regarding the evolution of human disease became increasingly
complex; particular pathological conditions were associated with
specific chromosomal abnormalities, such as Lejeune's association of
Down syndrome with an extra copy of chromosome 21 [3,4].

Advances in technology once again spurred discovery when, in
1958, Rothfels and Siminovitch published a new cytogenetic
air-drying technique for flattening chromosomes [5]. The application of
this technology later allowed Hungerford and Nowell to further
characterize their initial observation that two patients with chronic
myelogenous leukemia (CML) had a characteristic small chromosome
[6]. Soon after the initial publication, Hungerford and Nowell were
able to report on a series of seven patients, all of whom harbored this
minute chromosome [7]. This was coined the "Philadelphia
chromosome" after the city in which the abnormal chromosome was
discovered, in accord with the Committee for the Standardization of
Chromosomes [8]. The rearrangement leading to the Philadelphia
chromosome was eventually characterized as a translocation between
chromosomes 9 and 22 [9], resulting in the fusion of the breakpoint
cluster region (BCR) gene on chromosome 22 with the v-abl Abelson
murine leukemia viral oncogene homolog (ABL1) gene on
chromosome 9 [10]. Later, in 1990, Lugo et al. demonstrated that the BCR-ABL1
fusion protein is an active tyrosine kinase by immunoblotting
cell lysates from transfected Rat 1 cells, revealing that cells transfected
with either BCR-ABL1 or v-src, but not v-H-ras or v-myc, had a
significant increase in total phosphotyrosine content [11].
Understanding the molecular mechanism of BCR-ABL1 led to the
development of one of the first molecularly tailored therapies, as the small
molecule Imatinib was specifically selected for its ability to inhibit
BCR-ABL1 kinase activity [12,13]. The success of treating chronic
myelogenous leukemia with a specific inhibitor of the BCR-ABL1
chimera led to a strong interest in the discovery of novel gene fusions
in other cancer subtypes, with the long term goal of designing
disease-specific therapeutics.

As techniques like the use of chromosome banding for karyotypic
analysis were improved, the impact on discovery of novel gene fusions
was immediately evident in leukemias and lymphomas. In fact, while
BCR-ABL1 is perhaps the most famous gene fusion, the first
molecularly characterized chimera was discovered by Zech et al.
through the use of karyotypic analysis and is actually involved in the
pathogenesis of Burkitt's lymphoma. While
this karyotypic analysis demonstrated absence of the distal region
on the long arm of chromosome 8 and an extra band in the distal
segment of the long arm of chromosome 14 [14], the genes involved in the
rearrangement remained elusive until 1982, when it was
demonstrated that the translocation altered the c-MYC oncogene [15] and
that the promoter and 5' region of the immunoglobulin heavy chain
(IGH) gene were rearranged such that the IGH promoter controls
c-MYC expression [16]. Although this fusion does not lead to a chimeric
protein, it was demonstrated that aberrant c-MYC expression through
the IGH promoter is a necessary component of malignant
transformation in Burkitt's lymphoma [17].

As with lymphoma research, karyotypic analysis rapidly led to the
identification of recurrent breakpoints that seemed to characterize
subsets of myeloid leukemia. For example, in 1973, the acute myeloid
leukemia 1 (AML1) gene was cloned from the breakpoint region of the
first recurrent translocation described in leukemia, t(8;21) [18]. In
1991, the AML1 gene was found to be fused to the eight-twenty one
(ETO) gene on chromosome 8, which is also known as runt-related
transcription factor 1 translocated to 1 (RUNX1T1) [19,20].

As the techniques of molecular biology improved, it became easier
and easier to obtain the DNA sequence adjacent to chromosomal
breakpoints. Since the original identification of AML1 in myeloid
leukemia, over 10 genes have been described to participate in
rearrangements with AML1 [21]. In fact, advances in sequencing
technology led to the realization that several genes are recurrently and
promiscuously fused to multiple partners, the examples of which are
ever increasing. In addition to AML1, the other notable example of a
promiscuous fusion gene partner is the mixed-lineage leukemia (MLL)
gene, which is involved in over 40 different rearrangements (reviewed
in [22]). In fact, because of the variety and difficulty of discussing all
chromosomal aberrations in human malignancies, Mitelman et al.
maintain and frequently update an online database of rearrangements
and chromosome aberrations from all malignant neoplasms [23].

With the rapid development of current technologies like
high-throughput sequencing, our understanding of the origins of disease
has revealed a critical involvement of chromosomal aberrations, in
particular the role of translocations and gene fusions in malignant
development. With a better understanding of the role of these
chromosomal aberrations, therapies designed to inhibit the molecular
function of chimeric proteins have recently been developed and, like
Imatinib, some have demonstrated a window of strong efficacy.
Consequently, much hope has been generated by the potential for
targeting existing and novel gene fusions that characterize specific
cancer subtypes with rationally designed, molecularly tailored
therapies. Here, we review known genomic rearrangements in
epithelial tumors that lead to aberrant expression of chimeric
transcripts and the emerging technologies that may lead to the
identification of novel gene fusions.

2.  Gene  fusions  in  epithelial  cancers 

In order to highlight the number of genomic rearrangements
leading to fusion genes that characterize epithelial cancers, we have
surveyed some of the well-studied chimeras from several solid
malignancies and describe the fusions in approximate chronological
order (Fig. 1). In the ensuing sections, we will analyze concepts from a
global view of epithelial gene fusions with a few case studies of
rearrangements from leukemia and endometrial stromal tumors.
Gene fusions will be categorized into three different types: (1) those
which alter transcriptional regulation, (2) those which alter mRNA
regulation, and (3) those which alter protein activity. This will be
followed by a discussion of the potential reasons why gene fusions
have not been in the limelight of solid tumor pathogenesis and the
developing technologies that are being used to find novel recurrent
gene fusions in common epithelial tumors.

Fig. 1. Chronology of gene fusion discoveries in epithelial cancers.

2.1. RET-NTRK1

The initial discovery of an epithelial gene fusion in mid-1989
comes directly from a novel screening technique used to identify
transforming oncogenes. In this experimental approach, immortalized
NIH3T3 cells were transfected with fragments of tumor cell genomic
DNA and plated in soft agar. DNA is then isolated from cells and sequenced
or sub-cloned to identify critical fragments. Using this approach,
Martin-Zanca et al. identified the RET-NTRK1 genomic translocation,
providing some of the first insights into the possibility that recurrent
genomic rearrangements were not specifically hematologic
phenomena [24].

RET (rearranged during transfection) encodes a tyrosine kinase
[25,26] that was originally identified through transfection of DNA
from a human T-cell lymphoma into NIH3T3 cells [27]. NTRK1 is a
membrane-bound tyrosine kinase receptor that regulates neuronal
cell growth, differentiation, and programmed cell death pathways
[28]. Fusion of these two genes results in loss of the NTRK1 signal
sequence, giving rise to cytoplasmic localization and constitutive
activation of the fusion protein [29]. Interestingly, although NTRK1
was the first identified RET fusion partner, RET has several other
N-terminal fusion partners, including H4 [30,31], RIα [32], RFG5 [33] and
ELE1 [34,35]. One possible explanation for the diversity of genomic
rearrangements observed in papillary thyroid carcinoma (PTC) is that the
underlying pathology is simply dependent on deregulation of either the RET
or NTRK1 tyrosine kinase domain (reviewed in [36]). Consequently, the important
determining event in PTC carcinogenesis may be constitutive
activation of the mitogen-activated protein kinase (MAPK) signaling
pathway, which can be caused by rearrangement of either the RET
and/or NTRK1 gene. One reason for this hypothesis is that while the
RET-NTRK1 rearrangement appears to be the predominant gene fusion
responsible for childhood PTC, in adult-onset populations activating
point mutations in the BRAF gene or, controversially, the RAS gene
[37-43] also lead to constitutive activation of the MAPK pathway
without RET and/or NTRK1 genomic rearrangement [44].

In addition to differences in the age-related molecular onset of PTC,
the proportion of cases with either a RET or NTRK1 rearrangement also
appears to be based on the geographic area of origin [45-47], possibly
because thyroid cancer is established to be associated with exposure
to ionizing radiation [37,48]. Indeed, studies of patient populations
exposed to either the Chernobyl nuclear power plant accident [49,50]
or the atomic bombings [51] have demonstrated that genomic
rearrangements occur at a higher frequency than mutations following
extreme exposure to radiation [37,48], suggesting that under certain
biological conditions exposure to high dose radiation may actually
trigger specific DNA breaks leading to intentional genomic
rearrangement. In fact, the fusion proteins that characterize PTC contain a
number of different N-terminal partners fused to the C-terminal tyrosine
kinase domain of either RET or NTRK1 [52] that may depend on the
environmental cues leading to genomic rearrangement.

2.2. CTNNB1-PLAG1

Within a year of publication of the RET-NTRK1 genomic
rearrangement in PTC, another epithelial translocation was reported in
pleomorphic adenoma (PA) [53], a slow-growing epithelial tumor
that is responsible for more than 50% of salivary gland tumors [54], but
less than 10% of tumors from the head and neck [55]. In contrast to
RET-NTRK1, which was discovered by a screening technique,
rearrangements in PA were first identified by karyotypic analysis of primary
tumors. In fact, before any of the breakpoint genes were identified, PAs
were already divided into four cytogenetic groups (reviewed in [56]).
Rearrangements of 8q12 account for about 40% of PAs, with t(3;8)
(p21;q12) comprising about half of rearrangements at this locus.
Translocations of 12q14-15 account for about 8% of PAs, with t(9;12)
(p12-22;q13-15) or an ins(9;12)(p12-22;q13-15) responsible for
these abnormalities [57,58]. Tumors with non-recurrent clonal
changes comprise about 20% of PAs, and tumors with apparently
normal karyotypes account for the remaining cases [56].

Almost 20 years after the initial karyotyping studies, Kas et al. used
a comprehensive breakpoint mapping approach, Southern blot
analysis and 5' rapid amplification of cDNA ends (5' RACE) to identify
the genes involved in the most prevalent PA translocation, t(3;8)(p21;
q12), as β-Catenin (CTNNB1) and PLAG1 (pleomorphic adenoma gene
1) [59]. Specifically, the t(3;8)(p21;q12) rearrangement fuses the
β-Catenin (CTNNB1) promoter and exon 1 to PLAG1 exon 2, resulting in a
marked increase in PLAG1 expression (Fig. 2). Because the
gene fusion alters DNA-level regulation of the PLAG1 transcript,
it is characterized as type 1. Interestingly, the reciprocal
translocation links the PLAG1 promoter and exon 1 to β-Catenin exon
2, reducing β-Catenin expression. As β-Catenin signals through
several well-characterized oncogenic pathways (reviewed in [60]),
the reduction in β-Catenin is curious. PLAG1, however, belongs to the
PLAG family of proteins and encodes a zinc finger protein with two
putative nuclear localization signals that can bind to either DNA or
RNA. Forced expression of PLAG1 in NIH3T3 cells has demonstrated
that this protein can induce the standard characteristics of neoplastic
transformation, including loss of cell-cell contact inhibition,
anchorage-independent growth, and tumor formation in nude mice xenografts
[61]. This suggests that the constitutive activity of the CTNNB1
promoter leads to sufficient PLAG1 expression for malignant
transformation in PA.

Fig. 2. Genomic structure of gene fusions with altered transcriptional regulation. The CTNNB1-PLAG1 and TMPRSS2-ERG chimeras represent an important class of gene fusions in
which the proto-oncogene remains largely intact, but the genomic rearrangement places a new promoter and 5'-UTR upstream of the main coding sequence, leading to aberrant
expression of the proto-oncogene.

2.3.  PRCC-TFE3 

As cloning and molecular strategies improved in the early 1990s,
another recurrent gene fusion would soon be described in papillary
renal cell carcinoma (PRCC), the second most common carcinoma of
the renal tubules, accounting for 15-20% of all renal cell carcinomas
[62-66]. Karyotypic analysis as early as 1986 (de Jong et al.) led to the
identification of abnormalities in the Xp11.2 region characterized by a
genomic rearrangement, t(X;1)(p11.2;q21.2) [62-66]. Interestingly,
before any of the genes surrounding the breakpoint were cloned, the
gene encoding TFE3, a protein originally identified by its ability to
bind μE3 elements in the immunoglobulin heavy chain intronic
enhancer [67], was mapped to the Xp11.22 locus [68] and later shown
to encode a member of the basic helix-loop-helix leucine zipper
(bHLHzip) family of transcription factors. After the
original genomic mapping, TFE3 was soon identified at the
translocation breakpoint by Southern blot analysis [69]. Subsequent 5'-RACE
identified PRCC, a ubiquitously expressed gene that encodes a protein
with a high proportion of prolines and glycines, including three
P-X-X-P motifs that are known to interact with SH3 domains [70,71].
Interestingly, the fusion event leading to the PRCC-TFE3
rearrangement also results in a reciprocal TFE3-PRCC gene fusion [69,72].

To elucidate the properties of these reciprocal gene fusions,
Weterman et al. introduced wild type PRCC, wild type TFE3,
PRCC-TFE3 and TFE3-PRCC expression vectors into COS cells and postulated
that only the PRCC-TFE3 gene fusion was responsible for tumor
formation based on its ability to activate a generalized reporter assay
[73]. Thus, the PRCC-TFE3 genomic rearrangement is type 3, as the
fusion protein gained a novel function through rearrangement.
However, the pre-mRNA splicing factors PSF and NonO are
also recurrently fused to TFE3, albeit at a much lower frequency than
PRCC [69,72,74], suggesting that the TFE3 portion of the fusion is
responsible for malignant transformation. Subsequent transcriptional
activation assays demonstrated that of the PSF-TFE3, NonO-TFE3 and
PRCC-TFE3 chimeras, only the PRCC-TFE3 fusion protein could activate
the plasminogen activator inhibitor-1 (PAI-1) promoter [75],
suggesting that only this gene fusion retains transcriptional activity. However,
recent co-immunoprecipitation experiments demonstrated that
antibodies against the pre-mRNA splicing factors SC35, PRL1, and CDC5
were able to immunoprecipitate wild type PRCC, and an anti-SM
antibody was able to immunoprecipitate the PRCC-TFE3 fusion
protein [75]. These data suggest that, although the fusion protein
may partially function through transcriptional pathways, it may also
function by altering pre-mRNA splicing; more conclusive
experiments are needed to demonstrate this.

2.4. HMGA2, evading let-7

While most of the gene fusions discovered until this point,
including PRCC-TFE3, were thought to define specific epithelial
tumor types, a new gene fusion that was associated with several
different tumor types, including pleomorphic adenoma (PA) (see
above), lipoma, uterine leiomyoma and some myeloid malignancies
[76], would refute this notion. In fact, translocations
involving 12q15 had been established by karyotypic analysis in
multiple tumor types before the rearranged genes were actually
identified, and one of the genes involved in the
t(9;12)(p12-22;q13-15) PA translocation was first identified in both mesenchymal tumors
[77] and lipomas [78]. The first gene to be described was the 5'
fusion partner, HMGA2 (high mobility group AT-hook 2), which belongs to
the non-histone chromosomal high mobility group (HMG) protein
family, small nuclear proteins (<30 kDa) that undergo
extensive post-translational modifications and contain nine-amino-acid
segments that bind AT-rich DNA stretches in the minor groove
(AT-hooks) (reviewed in [79]). Subsequent 3' RACE of tumor samples
revealed that HMGA2 has two different 3' partners in PA, FHIT and
NFIB, both of which contribute very little coding sequence to the
resulting fusion gene. In fact, in one class of translocations, HMGA2
exon 3 is fused to FHIT exon 9 or 10, resulting in retention of the
C-terminal 26 amino acids of FHIT [80], and in the other set, fusion of
HMGA2 exon 3 or 4 to NFIB exon 9 appends five amino acids (SWYLG)
to the truncated HMGA2 protein [81].

Surprisingly, transgenic mice overexpressing wild type HMGA2
were observed to have similar phenotypes to mice expressing the
truncated HMGA2 protein found in the PA gene fusions [82-84].
To complicate this observation, in hereditary renal cell carcinoma,
FHIT was previously demonstrated to be fused to the patched-related
gene TRC8 by t(3;8)(p14.2;q24.1) [85,86], and the (SWYLG) amino
acid motif found in the HMGA2-NFIB gene fusion was shown to be
essential for NFIB function [81]. Recent research, however, has shed
light onto the importance of these translocations to neoplastic
transformation.

The discovery that small RNAs called microRNAs can negatively
regulate gene expression through direct binding to a gene's 3'-UTR
has led to the hypothesis that certain microRNAs can function as
tumor suppressors in cancer [87]. Bioinformatic analysis of the
HMGA2 3'-UTR demonstrated that the mRNA contains seven
conserved sites complementary to the let-7 microRNA [88] (depicted
in Fig. 3). To show that the let-7 microRNA negatively influences
HMGA2 expression, Mayr et al. built an HMGA2 3'-UTR-conjugated
luciferase reporter and demonstrated that let-7 represses its
expression [89]. As such, although the genomic rearrangements between
HMGA2 and FHIT or NFIB yield fusion proteins, replacement of the
let-7-regulated 3'-UTR seems to be the critical event because it leads to
HMGA2 overexpression, which is sufficient for neoplastic
transformation. Thus, the HMGA2 genomic rearrangements represent the first of a
novel class of gene fusions, type 2, in which fusion gene activity is
enhanced by loss of mRNA-level regulation (Fig. 3).
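
The logic of this type 2 mechanism can be illustrated with a minimal sketch. It assumes the canonical mature let-7a sequence and a simple 7-nucleotide seed match (microRNA positions 2-8). The two UTR strings are made-up toy sequences, not the real HMGA2, FHIT or NFIB 3'-UTRs, and real predictors such as TargetScan also weigh conservation and site context.

```python
# Toy illustration: count let-7 seed matches in a 3'-UTR.
# Assumption: a "site" is a perfect match to miRNA positions 2-8 only.

LET7A = "UGAGGUAGUAGGUUGUAUAGUU"  # mature let-7a, written 5'->3'

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_site(mirna):
    """Return the mRNA sequence (5'->3') that pairs with miRNA positions 2-8."""
    seed = mirna[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_sites(utr, mirna=LET7A):
    """Return 0-based start positions of every seed match in the UTR."""
    site = seed_match_site(mirna)
    utr = utr.upper().replace("T", "U")
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


# Hypothetical sequences, for illustration only.
wild_type_utr = "AAGCUACCUCAGGAUUCUACCUCAAUGG"   # carries two seed matches
fusion_utr = "AAGGCCAUUGGCAAUUGGCCAAUU"          # replacement UTR, no matches

print(seed_match_site(LET7A))      # the motif searched for in the mRNA
print(find_sites(wild_type_utr))   # positions where let-7 could bind
print(find_sites(fusion_utr))      # empty list: the fusion escapes let-7
```

In this toy model, swapping the 3'-UTR removes every seed match, which is the sense in which the fusion transcript evades microRNA-mediated repression.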

2.5. Pax8-PPARγ

In 2000, Kroll et al. employed fluorescence in situ hybridization
(FISH), yeast artificial chromosome mapping and 3' RACE to identify
genes involved in a genomic rearrangement, t(2;3)(q13;p25) [90],
that was originally identified by karyotype analysis of follicular
thyroid carcinomas (FTC), a subset (10-20%) of all thyroid malignancies
[91]. This translocation is thought to be specific to FTC as it has not
been reported in other thyroid tumors or hyperplastic nodules [92]. In
the resulting gene fusion, the Pax8 (paired box gene 8) gene is fused to
PPARγ (peroxisome proliferator-activated receptor-γ), a ubiquitously
expressed transcription factor [90]. The Pax8 protein is involved in
thyroid follicular cell development and regulation of thyroid-specific
gene expression [93]. PPARγ plays a major role in a number of
different diseases including obesity, atherosclerosis and diabetes, as well
as cancer (reviewed in [94]). Because Pax8 is a thyroid-specific
transcription factor and because its DNA binding domain is fused
to the C-terminal domains of PPARγ [90], the resulting protein chimera
is thought to cause a constitutive redistribution of PPARγ-directed
transcription. In 2005, gene-expression microarray profiling revealed
a distinct signature in follicular thyroid carcinomas harboring the
Pax8-PPARγ gene fusion in which cell growth and chromatin
remodeling pathways were over-represented and protein biosynthesis
pathways were under-represented as compared to follicular
thyroid carcinomas without the translocation [95], suggesting that
PPARγ-directed transcription is indeed redefined by the gene fusion.

Fig. 3. HMGA2 gene fusions elude the let-7 family of microRNAs. The HMGA2 mRNA structure is shown along with putative let-7 family binding sequences in the HMGA2 3'-UTR.
Results were predicted by TargetScan [202] and three representative microRNAs are shown with their highest probability binding sites of the seven total predicted sites along the 3'
UTR. Distance to each predicted binding site is annotated as nucleotides from the start of the 3'-UTR. Below the wild type HMGA2 mRNA are the HMGA2-FHIT and HMGA2-NFIB
mRNAs that result from these two gene fusions. TargetScan did not predict any microRNA binding sites in these genes. As such, the HMGA2 gene fusions represent a second class of
gene fusions in which the recombination event allows the proto-oncogene mRNA to evade microRNA-mediated silencing.

Interestingly, follicular thyroid carcinomas were originally thought
to arise from disruption of distinct molecular pathways, either
through the fusion of Pax8 to PPARγ, or through the acquisition of
point mutations leading to the constitutive activation of the G-protein
RAS. In fact, one study reported that 16/33 (49%) of follicular
carcinomas had RAS mutations, 12/33 (36%) had Pax8-PPARγ
rearrangement, only 1/33 (3%) had both, and 4/33 (12%) had neither
[96]. However, in 2006, quantitative reverse transcription PCR analysis
of follicular carcinoma clinical samples demonstrated loss of the
tumor suppressor NORE1A in samples harboring the Pax8-PPARγ
rearrangement, but not in other samples [97]. Because NORE1A
binds to the GTP-bound (activated) RAS protein and suppresses RAS
activity, this discovery suggested that activation of the RAS pathway is
a critical event in pathogenesis of thyroid carcinoma that is altered
either directly by activating mutation, or indirectly by the Pax8-PPARγ
rearrangement.
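
The near mutual exclusivity of the two lesions can be made concrete by tabulating the counts quoted above from [96]. The sketch below assumes that the four categories are disjoint (the 16 and 12 exclude the single tumor carrying both lesions), which is the only reading under which they sum to 33. The percentages it prints are computed from the counts and may differ by a rounding step from those quoted. The odds ratio is a simple illustrative summary, not a statistic reported in the source.

```python
# Counts of follicular carcinomas by lesion status, as quoted from [96].
# Assumption: the four categories are mutually exclusive.
counts = {
    "RAS mutation only": 16,
    "Pax8-PPARg rearrangement only": 12,
    "both": 1,
    "neither": 4,
}

total = sum(counts.values())

for label, n in counts.items():
    print(f"{label}: {n}/{total} = {100 * n / total:.1f}%")

# Odds ratio for co-occurrence in the 2x2 table; values far below 1
# indicate that the two lesions rarely occur together.
odds_ratio = (counts["both"] * counts["neither"]) / (
    counts["RAS mutation only"] * counts["Pax8-PPARg rearrangement only"]
)
print(f"odds ratio for co-occurrence: {odds_ratio:.3f}")
```

An odds ratio this far below 1 is consistent with the two lesions acting as alternative routes to the same pathway, as the NORE1A result suggests.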

2.6.  BRD-NUT 

Soon after the discovery of the Pax8-PPARγ rearrangement, the
translocation t(15;19)(q13;p13.1) was identified in a rare, highly
aggressive carcinoma arising in the midline organs and upper
respiratory tract of young people, now termed nuclear protein in
testis (NUT) midline carcinoma (NMC) [98-100]. BRD4, which
contains the chromosome 19 breakpoint, has two annotated
transcripts encoding either short or long forms of the protein that both
contain N-terminal bromodomains. The longer BRD4 transcript
encodes a ubiquitously expressed 200 kDa nuclear protein [101]
with a C-terminal lysine-rich region that is not found in the shorter
transcript. The translocation resulting in fusion to the NUT gene
(identified by Southern blot analysis) only disrupts the longer BRD4
transcript, resulting in the loss of the lysine-rich region in the fusion
oncogene. Several studies of BRD4 in both murine and human cell line
models have demonstrated a critical role in cell cycle progression and
cell proliferation [102,103]. In fact, Brd4 enhances cell growth by
interacting with chromatin [104], replication factor C [102], and
cyclin T1 and CDK9, which constitute the core positive transcription
elongation factor b (P-TEFb) [105]. Likewise, chromatin
immunoprecipitation assays demonstrated that Brd4 is required to recruit P-TEFb to
active promoters, and that increased Brd4 leads to increased
P-TEFb-dependent phosphorylation of RNA polymerase and enhanced
transcription from promoters in vivo [105].

More  insight  into  the  role  of  the  BRD4-NUT  fusion  protein  in  NMC 
biology  came  from  a  screen  for  other  NMC  gene  fusions.  Because  the 
BRD4-NUT  translocation  defines  two-thirds  of  all  NMCs,  French  et  al. 
used  a  candidate  gene  approach  to  screen  other  NMC  samples  and 
discovered  another  recurrent  translocation  between  BRD3  and  NUT 
that defined a large portion of the remaining NMC cases [106]. The
BRD3-NUT  fusion  gene  encodes  a  protein  highly  similar  to  that 
encoded  by  the  BRD4-NUT  transcript.  It  is  composed  of  two  tandem 
chromatin-binding  bromodomains,  an  extra-terminal  domain,  a 
bipartite  nuclear  localization  sequence,  and  a  significant  portion  of 
NUT  coding  sequence.  As  such,  the  conserved  protein  structure  gave 
insight  into  the  mechanism  by  which  the  chimeric  protein  induces 
neoplastic  properties. 

Wild  type  NUT,  which  is  normally  only  expressed  in  the  testis  [99], 
contains  both  nuclear  localization  and  export  signal  sequences  and  is 
shuttled  between  the  nucleus  and  cytoplasm  via  a  leptomycin- 
sensitive  pathway  [106].  Importantly,  however,  the  Brd3-NUT  and 
Brd4-NUT  proteins  are  retained  in  the  nucleus,  suggesting  that 
interactions  between  the  Brd3  or  Brd4  bromodomains  and  chromatin 
are  essential  to  the  fusion  protein  [106]  (Fig.  4).  Further  evidence  for 
this hypothesis comes from an siRNA experiment in which
knockdown of Brd-NUT fusion transcripts in NMC cell lines resulted in
squamous  differentiation  and  cell  cycle  arrest  [106].  This  suggested 
that  the  nuclear  retention  of  NUT,  not  the  loss  of  the  Brd  C-terminal 
domain,  is  responsible  for  promoting  NMC  carcinogenesis  [106].  The 
realization  that  Brd-NUT  gene  fusions  define  a  class  of  translocations 
that  fuse  bromodomains  to  the  NUT  protein  suggests  that  oncogenic 
translocations  will  arise  from  multiple  partners  when  critical  domains 
are  present  in  more  than  one  gene. 


Fig. 4. Nuclear retention of NUT. The BRD4-NUT gene fusion represents a third class of rearrangements in which the resulting protein gains activity to become a proto-oncogene. In
this case, the two bromodomains of BRD4 are fused to NUT. Although NUT usually cycles between the nucleus and cytoplasm in a highly controlled manner, appendage of the BRD4
bromodomains to the majority of the NUT protein leads to nuclear retention of the protein and aberrant activity.
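
The three classes introduced so far can be summarized as a small lookup table. This is only an organizational sketch: the class names and one-line descriptions paraphrase the text and figure captions above, and the identifiers (`FusionType`, `FUSION_CLASS`, `fusions_of_type`) are hypothetical names chosen for illustration.

```python
from enum import Enum


class FusionType(Enum):
    """Mechanistic classes of gene fusion as used in this review."""
    TYPE_1 = "altered DNA-level (promoter) regulation of a largely intact proto-oncogene"
    TYPE_2 = "loss of mRNA-level regulation, e.g. microRNA sites in the 3'-UTR"
    TYPE_3 = "chimeric protein that gains a novel function or activity"


# Assignments taken from the text and from the captions of Figs. 2-4.
FUSION_CLASS = {
    "CTNNB1-PLAG1": FusionType.TYPE_1,
    "TMPRSS2-ERG": FusionType.TYPE_1,
    "HMGA2-FHIT": FusionType.TYPE_2,
    "HMGA2-NFIB": FusionType.TYPE_2,
    "PRCC-TFE3": FusionType.TYPE_3,
    "BRD4-NUT": FusionType.TYPE_3,
}


def fusions_of_type(fusion_type):
    """Return the fusions assigned to one class, sorted by name."""
    return sorted(name for name, ft in FUSION_CLASS.items()
                  if ft is fusion_type)


for ft in FusionType:
    print(ft.name, "-", ft.value, ":", ", ".join(fusions_of_type(ft)))
```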
2.7. ETV6-NTRK3

The first major example of a recurrent epithelial rearrangement
that not only appeared in multiple tumor types, but had also been
reported in a large subset of hematologic malignancies, was detected in
several cases of secretory breast carcinoma, a rare subtype of
infiltrating ductal carcinoma affecting both children and adults [107].
Tognon et al. detected the ETV6-NTRK3 fusion by comprehensive FISH
analysis in 92% (12 of 13) of secretory breast carcinoma cases [108]. ETV6
(also TEL) is an ETS family member that is involved in a large number
of fusions to either a transcription factor like AML1 [109] or to a protein
tyrosine kinase domain like that of ABL [110,111], JAK2 [112-114], ARG
[115,116], PDGFRβ [117] or FGFR3 [118], each of which defines a unique
leukemia subtype (reviewed in [119]). ETV6 contains a pointed
oligomerization domain (PNT; also known as sterile alpha motif,
SAM, or helix-loop-helix, HLH) and an ETS DNA binding domain, the
expression of which is required for developmental processes such as
hematopoiesis and yolk sac angiogenesis [120]. NTRK3 is a
transmembrane neurotrophin-3 surface receptor that contains a C-terminal
protein tyrosine kinase domain and plays a role in growth,
development, and cell survival of neural cells in the central nervous system
(reviewed in [121]). The fusion of the N-terminal ETV6 pointed
domain to the C-terminal tyrosine kinase domain of NTRK3 was first
reported in congenital fibrosarcoma (CFS) [122], but has since been
reported in multiple cell lineages including those that give rise to
congenital mesoblastic nephroma (CMN), acute myelogenous
leukemia, and secretory breast carcinoma [108] (reviewed in [123]).

Following the initial discovery, research focused on the transforming
ability of the recombination product. By using retroviral gene delivery
methods, the ETV6-NTRK3 fusion gene was shown to be sufficient to
induce the non-tumorigenic murine breast cell lines Eph4 (epithelial)
and Scg6 (myoepithelial), as well as NIH-3T3 fibroblasts, to form tumors
and glandular structures and to express epithelial antigens [108]. This
discovery suggested that the fusion gene acts as a dominant oncogene in
secretory breast cancer. ETV6-NTRK3 was also shown to inhibit TGF-β
tumor suppressor activity in NIH3T3 cells [124], suggesting that it most
likely regulates microRNA biogenesis indirectly [125], but this has not
yet been explored. Although it is known that adults have a less favorable
prognosis than children and distant metastases are rare [126], local
recurrences and nodal metastases have been observed [127], suggesting
that the gene fusion leads to an invasion-associated transcriptional
program, but this also has not been explored. Although it is known
that constitutive activation of the fusion protein leads to activation of
the Ras-mitogen-activated protein kinase (MAPK) pathway and the
phosphoinositide-3-kinase (PI3K)-AKT pathway, the mechanism
leading to activation of these pathways remained elusive until recently,
when the fusion protein was shown to associate with c-Src by
immunoprecipitation from fusion-positive CFS and CMN human
primary tumors [128]. More recently, a mouse knockin
model was created by introducing the human NTRK3 cDNA into exon 6
of the mouse ETV6 locus, which induced a fully penetrant, multifocal
breast cancer [129]. By using microarray analysis of unsorted and sorted
tumors from this model, as well as NIH3T3 cells transduced with the
fusion gene, the authors showed that ETV6-NTRK3 enriches for WNT
target genes through activation of the AP1 complex [129]. The
requirement for AP1 activity in ETV6-NTRK3-mediated transformation
was confirmed by showing that the co-expression of a dominant
negative component of the AP1 complex, c-JUN TAM67, with the gene
fusion blocked tumorigenic properties both in vitro and in vivo [129].
The ETV6-NTRK3 gene fusion represents one of the last gene fusions to
be discovered by traditional biological techniques.

2.8.  TMPRSS2-ETS 

In 2005, advances in bioinformatics led to the discovery of
rearrangements on chromosome 21 between TMPRSS2
(transmembrane protease, serine 2) and ERG (v-ets erythroblastosis virus E26
oncogene homolog (avian)), resulting in the TMPRSS2-ERG gene
fusion. Thus far, genomic rearrangements leading to an ERG gene
fusion have been reported in approximately 50% of clinically localized
prostate cancers (reviewed in [130]). TMPRSS2 is a prostate-specific,
androgen-regulated gene [131-133] that has two annotated
transcription variants, both of which are involved in the fusion with
ERG: the annotated TMPRSS2 variant in about 50% of the gene fusions, an
alternative TMPRSS2 variant in 10% of gene fusions, and both variants
in slightly more than 40% of analyzed gene fusions [134]. ERG belongs
to the ETS family of transcription factors and has two transcription
variants that differ only slightly in the 5'-UTR (deleted in the gene
fusion) and in the usage of an in-frame exon, the role of which remains
undefined. The most common TMPRSS2-ERG gene fusion variants
involve TMPRSS2 exon 1 or 2 fused to ERG exon 2, 3, 4, or 5 [134-143]
and, less frequently, TMPRSS2 exon 4 or 5 fused to
ERG exon 4 or 5 [141]. In line with the combinatorial complexity of
TMPRSS2-ERG rearrangements, different fusions have correlated
with slightly different phenotypic outcomes. For example, TMPRSS2
exon 2 fused with ERG exon 4 is associated with aggressive disease,
while others have been associated with seminal vesicle invasion and
poor outcome [143].

Like TMPRSS2, the TMPRSS2-ERG gene fusion is androgen-regulated
in an androgen-responsive cell line (VCaP) carrying the
rearrangement [135], but not in an androgen-insensitive cell line harboring the
fusion (NCI-H660) [144]. We have shown that VCaP cells and benign
prostate cells forced to overexpress ERG drive components of the
plasminogen activation pathway to mediate cellular invasion using
transwell migration assays [145]. We have also reported that primary
or immortalized benign prostate epithelial cells overexpressing ERG
have a transcriptional program with high levels of several
invasion-associated genes, but do not display phenotypic increases in cellular
proliferation or anchorage-independent growth [145]. Despite this,
one group recently identified c-MYC as a downstream target of ERG
and demonstrated that ERG knockdown in TMPRSS2-ERG expressing
CaP cells resulted in loss of cell growth in vitro and loss of
tumorigenicity in vivo, with only 22% (2/9) of mice developing detectable
tumors at day 42 from siRNA-treated cells as compared to 100% (5/5) in
the control group [146]. Interestingly, transgenic mice expressing an
androgen-regulated ERG fusion gene develop mouse prostatic
intraepithelial neoplasia (PIN), a precursor lesion of prostate cancer, but not
prostate cancer. Taken together with our in vitro data, these results
suggest that, without secondary molecular lesions such as loss of the
tumor suppressors PTEN or NKX3-1, the TMPRSS2-ERG gene fusion
may not be sufficient for transformation [145,147,148].

Although ERG clearly participates in the majority of ETS family
gene fusions in prostate cancer, other ETS family members including
ETV1 [135], ETV4 [149,150] and ETV5 [151] also contribute to gene
fusions in prostate cancer, albeit at a much lower frequency. In
contrast to ERG, for which TMPRSS2 is the only known 5' partner, the
other ETS family members may have a variety of 5' partners, including
those with androgen-responsive promoters (TMPRSS2, SLC45A3, KLK2,
HERV-K_22q11.23 and CANT1), one with an androgen-insensitive
but constitutively active promoter (HNRPA2B1), and one
with an androgen-repressed promoter (C15orf21) [135,149,151-153].
As in the case of ERG, forced expression of ETV1 under the control of a
CMV promoter did not enhance cell proliferation in benign prostate
epithelial cell lines and did not lead to anchorage-independent colony
formation in soft agar, but did lead to the enrichment of genes
associated with invasion [145]. Consistent with this, knockdown of ETV1 in
LNCaP cells prevented transwell invasion through Matrigel [145,154].

2.9.  EML4-ALK 

Recently, Soda et al. reported a retroviral-mediated transformation
screen in which they created a cDNA expression library from a
surgically resected lung adenocarcinoma [155]. Following
transformation of NIH3T3 cells, cDNAs were recovered from cells by PCR
amplification and sequenced. One of these sequenced transcripts
contained a fusion between EML4 (echinoderm
microtubule-associated protein-like 4) and ALK (anaplastic lymphoma kinase) that was
later confirmed as an inversion of chromosome 2p in 6.7% (5 of 75) of
non-small cell lung cancer (NSCLC) patients [155]. Wild type EML4 is a
member of the EMAP family of proteins, and its amino-terminus
(amino acids 1-249) was previously demonstrated to be essential for
microtubule formation in HeLa cells [156]. ALK encodes a tyrosine kinase
and a MAM domain (a domain frequently found on the extracellular side
of the membrane on many receptors). Despite the apparent low frequency
of EML4-ALK gene fusions in NSCLC, the transforming ability of EML4-ALK
gene fusion variants 1, 2, and 3b, but not of a kinase-inactive mutant
(K589M), has been demonstrated by engrafting NIH-3T3 cells infected with
retroviral expression vectors and showing that tumors arise in 8/8
mice from all groups except for the kinase-dead mutant [157].

To corroborate the low frequency of EML4-ALK rearrangements in
NSCLC, careful PCR-based analysis was completed on NSCLC cases to
identify novel in-frame EML4-ALK gene fusions, which led to the
identification of two novel fusion isoforms called variants 3a and 3b
[157]. Even more recently, analysis of a cohort of 253 lung
adenocarcinoma patient samples identified two new EML4-ALK
fusions in which either exon 14 or exon 2 of EML4 was fused to exon
20 of ALK (variants 4 and 5, respectively); however, only 4.35% of
patients were found to express any of the 5 known EML4-ALK
genomic rearrangements [158]. A similarly low rate of the EML4-ALK
fusion was reported in a study of 104 lung cancer surgical specimens
with only one fusion-positive case [159] and, in a study of different
lung cancers, the fusion was identified in 3.4% (5 of 149) of
adenocarcinomas, but not in 48 squamous cell carcinomas, 3 large-cell
neuroendocrine carcinomas, or 21 small-cell carcinomas [160].
However, this is to be expected, given the small sample size from
non-adenocarcinomas. The ALK gene has previously been identified as
the 3' fusion partner of NPM [161], TPM3 [162], CLTC [163], ATIC
[164-166] and TFG [167]. In light of this observation, RT-PCR analysis
was used to screen all known hematologic ALK fusion partners in a
cohort of 77 NSCLC samples; however, no redundant fusion partners
were identified and only 2.6% (2 of 77) of NSCLC cases harbored the
EML4-ALK fusion [168]. To supplement the existing RT-PCR data in the
literature, our group developed a break-apart FISH assay to analyze
EML4-ALK fusion as well as the amplification of each gene. We
reported that the fusion occurred in less than 3% of NSCLC cases analyzed,
and that, in most cases harboring the lesion, not all cells exhibited the
fusion. We also found that EML4 and/or ALK amplification occurred,
indicating that other mechanisms of genomic rearrangement leading
to amplification may arise [169].
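
The frequencies quoted in this section can be checked and roughly combined with a few lines of arithmetic. The sketch below uses only the studies for which explicit counts are given above, and it omits the 253-sample cohort [158] because only a percentage is stated. The pooled figure is a naive sum of cases over samples, offered as an illustration under the assumption that the cohorts are comparable. The studies differ in histology, since [160] is restricted to adenocarcinomas, and in detection method.

```python
# (citation label, fusion-positive cases, samples screened), as quoted above.
studies = [
    ("[155] NSCLC patients", 5, 75),
    ("[159] lung cancer surgical specimens", 1, 104),
    ("[160] adenocarcinomas", 5, 149),
    ("[168] NSCLC samples", 2, 77),
]


def percent(positive, screened):
    """Frequency of fusion-positive cases as a percentage."""
    return 100.0 * positive / screened


for label, positive, screened in studies:
    print(f"{label}: {positive}/{screened} = {percent(positive, screened):.1f}%")

# Naive pooled estimate: total positives over total samples.
pooled_positive = sum(positive for _, positive, _ in studies)
pooled_screened = sum(screened for _, _, screened in studies)
print(f"pooled: {pooled_positive}/{pooled_screened} = "
      f"{percent(pooled_positive, pooled_screened):.1f}%")
```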

2.10.  SLC34A2-ROS 

In 2007, a survey of phosphotyrosine signaling in lung cancer not
only led to the re-identification of the EML4-ALK fusion, but also to the
discovery of a novel translocation between chromosomes 4p15 and
6q22, in which the transmembrane domain-containing N-terminal
region of solute carrier family 34, member 2 (SLC34A2) is fused to
an N-terminal transmembrane domain of c-ros oncogene 1
(ROS), respectively, in the lung cell line HCC78 [170]. SLC34A2 is
encoded from a single transcription variant. ROS, a type 1
integral membrane-bound tyrosine kinase and a known oncogene
that is highly expressed in several tumor cell lines, is also encoded
from a single transcript. Interestingly, while the authors did not
identify SLC34A2 rearrangements with ROS in patient samples, a gene
fusion between CD74, located at 5q32, and ROS was observed, in
which the tandem transmembrane domain structure was again
observed [170]. This suggests not only that ROS is another
promiscuous gene fusion partner, but also that the tandem transmembrane
structure is one mechanism leading to constitutive activation of the tyrosine
kinase. Indeed, forced expression of the SLC34A2-ROS chimera
demonstrated constitutive kinase activity in the cellular membrane
fraction [170].

2.11. SLC45A3-ELK4

With the recent advent of next generation sequencing technology
(described below), our group has recently identified another
recurrent gene fusion in prostate cancer [171]. Using this technology,
we identified the fusion of SLC45A3 to ELK4, an ETS family member.
Here exon 4 of SLC45A3 is fused to exon 1 of ELK4. Interestingly, this
novel gene fusion was identified from the RNA of a cell line harboring a
known gene fusion involving another ETS family member gene, ETV3.
Likewise, this novel gene fusion involves SLC45A3, which is known to
fuse with ETV1 in other prostate cancer cases. Unlike other gene
fusions described to this point, SLC45A3-ELK4 seems to result from
polymerase read-through and intergenic splicing rather than genomic
rearrangement, as no alterations were detected at the DNA
level by fluorescence in situ hybridization (FISH), array comparative
genomic hybridization (aCGH) or high-density single nucleotide
polymorphism (SNP) arrays [171]. RNA-level gene fusions were recently
identified in endometrial stromal tumors and are discussed below.

3. Lessons from MLL translocations

While the list of epithelial-derived gene fusions continues to
expand, it is important to highlight unique mechanisms of oncogene
formation through specific genomic rearrangements from the hematological
malignancies. Translocations altering the mixed-lineage
leukemia (MLL) gene on 11q23 frequently lead to fusions with over
40 different genes on different chromosomes, with MLL-AF4 and MLL-AF9
among the most frequent chimeras (reviewed in [172,173]).
Interestingly, different MLL fusions are highly associated with either
acute myeloid leukemia (AML) or acute lymphoid leukemia (ALL),
depending on the fusion partner [174]. MLL is the mammalian
homologue of a Drosophila gene called trithorax, which was shown
to play a critical role in axial morphogenesis and patterning during
embryogenesis through the regulation of HOX genes (HOM-C in
Drosophila) [175,176]. Multiple studies have suggested that deregulation
of HOX gene expression contributes to leukemogenesis [177].
Additionally, retroviral transduction of an MLL fusion gene construct
was able to transform wild type, but not Hoxa9-deficient, bone
marrow cells, providing direct evidence that specific HOX gene
expression may be required for leukemogenesis [178]. Because MLL
chimeras often lose large fragments and different domains from either
the N- or C-terminal regions, the seemingly critical role of MLL-associated
HOX gene expression in leukemogenesis led to the question
of whether the molecular mechanisms by which wild type MLL
regulates gene expression are mutually exclusive of those
employed by MLL chimeras [179].

As the molecular mechanisms of MLL target gene regulation
continue to unravel, several studies have shed light on the fact that
molecular function in the wild type and fusion gene settings may
be unique, though the outcome of gene activity is ultimately similar.
Wild type MLL encodes a multi-domain protein with three AT-hooks
used for binding AT-rich DNA sequences and a histone methyltransferase
domain [180] and assembles into supercomplexes containing
several different chromatin remodeling enzymes on target DNA motifs
like those found in HOX genes [181]. Chimeric MLL proteins, on the
other hand, appear to utilize different mechanisms to modulate HOX
gene expression and initiate leukemogenesis. For example, fusion of
coiled-coil domains from GAS7 or AF1p to MLL endows the chimeric
protein with the ability to dimerize on the target gene promoters and
has been suggested to stimulate transcription through the inappropriate
recruitment of members of the MLL supercomplex [182].
This suggested that preventing dimerization of the coiled-coil
domains with targeted small molecules could inhibit MLL activity in
this subset of MLL fusions. In contrast, some MLL fusions lead to
constitutive nuclear retention while maintaining similar binding
patterns as the dimerizable MLL chimeras on the HoxA9 locus [183].
In the absence of a partner gene, MLL can acquire an in-frame partial
tandem duplication (PTD) of exons 5 through 11 (occurring in
approximately 4%-7% of AML cases) that causes overexpression of
HoxA7, HoxA9, and HoxA10 in spleen, BM, and blood in a knockin
mouse model [184]. As such, altering downstream HOX gene
expression appears to be one critical role of MLL gene fusions and
rearrangements.

Given  that  wild  type  and  chimeric  MLL  proteins  appear  to 
accomplish  at  least  one  similar  molecular  function  (HOX  gene 
regulation),  the  question  of  how  epithelial  gene  fusions  will  function 
in  comparison  to  their  wild  type  counterparts  remains  intriguing.  For 
example,  we  have  very  little  understanding  of  the  normal  molecular 
mechanisms utilized by ERG and ETV1 to control gene expression
(prostate cancer gene fusions, discussed above), let alone the critical
co-factors required for transcriptional regulation. Although we may
expect the molecular mechanisms of ERG- and ETV1-mediated gene
regulation  to  be  the  same  in  the  wild  type  and  fusion  settings  (because 
the  encoded  proteins  are  nearly  identical),  this  remains  to  be  proven. 
Perhaps  the  ability  to  design  rational  drug  targets  against  specific 
fusion  proteins  without  obvious  molecular  susceptibilities  (like  the 
tyrosine  kinase  activity  of  BCR-ABL)  will  depend  as  much  on  our 
understanding  of  each  fusion  protein's  function  and  critical  co-factors 
as  on  their  downstream  targets. 

4. Difficulty in identifying epithelial cancer gene fusions

With the discovery of the TMPRSS2-ERG gene fusion in prostate
cancer, we look back on the history of cancer biology and wonder why
gene fusions have not been identified in some of the most well-studied
epithelial cancers. Part of the problem was methodological, as
the chromosome quality in epithelial neoplasms is very poor when
compared to hematologic neoplasms. However, cytogenetic techniques
have improved dramatically since the discovery of the “minute”
chromosome in 1960 [6]. In fact, in the 1960s, chromosome patterns in
epithelial tumors were already being described as abnormal [185] and
it was often thought that the degree of cytogenetic changes
corresponded proportionally with clinical progression [186], making
the identification of individual and recurrent translocations difficult.
In fact, the idea that the induction of genomic instability is a critical
and intended step in the malignant progression of solid tumors has
gained considerable momentum [187,188]. Recently, it was demonstrated
that overexpression of Separase, a protein that is overexpressed
in a subset of breast cancers, can induce
chromosome instability and aneuploidy in the mutant p53 mouse
mammary epithelial cell line FSK3 [189]. Likewise, deregulation of
Mad2, which regulates separase activity, has been shown to promote
chromosomal instability, induce aneuploidy and lead to tumorigenesis
[190]. Interestingly, once Mad2-induced neoplastic transformation
has occurred, Sotillo et al. demonstrated that expression of Mad2 is no
longer required for tumor progression, suggesting that the induction of
chromosomal instability could be a transient event in oncogenesis
[190]. In fact, it is possible that specific gene fusions induce genomic
instability through deregulation of normal mitotic events like
separase or Mad2 activity or through novel mechanisms yet to be
described. If induction of chromosomal instability were a mechanism of
oncogenesis employed by a specific gene fusion, then induction of
other secondary “carrier” chromosomal rearrangements would simply
serve to mask the identification of the recurrent genetic rearrangement.
Such a progression pattern in epithelial tumors could explain
the complex heterogeneity often observed in such malignancies
(Fig. 5). In contrast, leukemias, lymphomas and mesenchymal tumors
are almost 95% clonal [191]. As such, the complexity and sheer number
of genomic rearrangements in epithelial malignancies have led to
difficulty in defining primary aberrations in these neoplasms. This
difficulty eventually led to the incorrect notion that genomic
rearrangements leading to gene fusions were simply less common
in epithelial tumors.

Fig. 5. Difficulty in discovering gene fusions. One possibility is that a critical function of
oncogenes in epithelial cancers is to alter genomic structure, and it has been suggested
that such changes could lead to cancer progression. If such a model were true,
it would explain the genomic heterogeneity observed in epithelial cancers that
has allowed recurrent gene fusions to go unnoticed in solid tumors.

5.  Mitelman  hypothesis 

In order to address this notion that fusion genes are almost
exclusively a hematologic phenomenon, Mitelman et al. completed a
comprehensive study of all known cytogenetically abnormal neoplasms
reported in the literature [192]. Importantly, data published
by the group supported the counter-hypothesis that, in every tumor
type, the numbers of recurrent balanced chromosome abnormalities,
gene fusions and balanced rearrangements are a function of the total
number of analyzed cases [192]. In this study, 271 gene fusions and
59 potential gene fusions (only one gene identified at the breakpoint)
were catalogued, involving 275 unique genes
[192]. This indicated that a substantial number
of genes were present in more than one chimeric transcript (e.g.,
MLL, ETV6 and RET as described above). In classifying each gene
fusion by the class to which each member of the chimera belonged,
the group demonstrated that the proportion of fusions belonging to
each class was approximately equal in both hematologic and solid
tumor malignancies, with the transcription factor class accounting
for 38-44% and the tyrosine kinase class for 5-7% [192]. This
study suggested that the occurrence of gene fusions is a general
molecular event that has no fundamental tissue-specific differences.
However, some gene rearrangements appear to function only in
specific genetic backgrounds; the TMPRSS2-ERG fusion, for example,
requires active androgen signaling, which favors
prostate specificity.

6.  Tissue-specific  gene  fusions 

The idea that genomic rearrangements are tissue-specific is an
emerging concept in the field of gene fusion biology. For example,
TMPRSS2 is a strongly androgen-regulated and prostate-specific gene
that is fused to the ETS family members ERG and ETV1 in prostate
cancer [135]. While other ETS family members form fusion genes
that give rise to other malignancies, chimeras between androgen-regulated
genes and ETS genes have only been observed in prostate
cancer [130]. Likewise, the ALK tyrosine kinase is frequently fused to
multiple partners in hematopoietic (myelogenous leukemia),
mesenchymal (congenital fibrosarcoma) and epithelial (secretory
breast carcinoma) malignancies, but no redundant fusion partners
have been identified across tissue types [159]. Retention of the TFE3
DNA binding domain in follicular thyroid carcinoma is another
example of this, as TFE3 is a thyroid-specific transcription factor
[93]. Importantly, little is understood about the molecular mechanisms
leading to gene rearrangement and the underlying reasons that
particular chimeras are formed recurrently. The idea that tissue-specific
rearrangements occur by fusing highly transcribed genes
holds promise and would at least partially explain the apparent tissue
specificity observed in the formation of chimeric transcripts even
between genes that are fused in multiple cancer types.

Table 1

Chromosomal rearrangements in epithelial cancers.

Malignancy | Gene fusion | Chromosome rearrangement | Method of discovery | Study | Ref.
Follicular thyroid carcinoma | PAX8-PPARγ | t(2;3)(q13;p25) | Primary tumor karyotypic analysis/FISH/3' RACE | Kroll et al. | [90]
Midline carcinoma | BRD3-NUT | t(9;15)(q34;q14) | Candidate gene FISH screen | French et al. | [106]
 | BRD4-NUT | t(15;19)(q14;p13) | Primary tumor karyotypic analysis/FISH/Southern blot | French et al. | [98]
Non-small-cell lung cancer | EML4-ALK | inv(2p) | Transformation assay/direct sequencing | Soda et al. | [155]
 | TFG-ALK | t(2;3)(p23;q12) | Tyrosine kinase activity screen/5' RACE | Rikova et al. | [170]
 | SLC34A2-ROS | t(4;6)(p15;q22) | | |
Papillary renal cell carcinoma | PRCC-TFE3 | t(X;1)(p11;q23) | Primary tumor karyotypic analysis/Southern blot/5' RACE | Sidhar et al. | [69]
Papillary thyroid carcinoma | RET-NTRK1 | t(1;10)(q21;q11) | Transformation assay/direct sequencing | Martin-Zanca et al. | [24]
Pleomorphic adenoma | CTNNB1-PLAG1 | t(3;8)(p21;q12) | Primary tumor karyotypic analysis/breakpoint mapping/Southern blot/5' RACE | Kas et al. | [59]
 | HMGA2-FHIT | t(3;12)(p14;q15) | Primary tumor karyotypic analysis/3' RACE | Geurts et al. | [80]
 | HMGA2-NFIB | t(9;12)(q24;q15) | Primary tumor karyotypic analysis/3' RACE | Geurts et al. | [81]
Prostate cancer | TMPRSS2-ERG | del(21)(q22) | COPA/exon walking/5' RACE | Tomlins et al. | [135]
 | TMPRSS2-ETV1 | t(7;21)(p21;q22) | | |
 | TMPRSS2-ETV4 | t(17;21)(q21;q22) | | Tomlins et al. | [149]
 | TMPRSS2-ETV5 | t(3;21)(p28;q22) | | Helgeson et al. | [151]
 | SLC45A3-ELK4 | del(1)(q32) | Integrated high-throughput sequencing | Maher et al. | [171]
 | DDX5-ETV4 | t(17)(q24;q21) | Candidate gene FISH screen/5' RACE | Han et al. | [150]
Secretory breast carcinoma | ETV6-NTRK3 | t(12;15)(q13;q25) | Primary tumor karyotypic analysis/FISH | Tognon et al. | [108]

The idea that gene fusions are tissue-specific could have profound
implications for the discovery of novel gene fusions. Clearly, however,
gene fusions do not always confer tissue specificity. HMGA2 has a 3'-UTR
that is negatively regulated by the let-7 microRNA and simply
replaces its 3'-UTR through rearrangement with another gene
(described above), therefore representing a gene fusion that most
likely retains functionality in multiple tissue types. As such, while this
concept may have its largest impact on underlying molecular
mechanisms of newly discovered gene fusions, it will probably not
alter the rate of gene fusion discovery.

7.  Discovery  of  novel  gene  fusions 

Although the rate of recurrent chromosomal rearrangement discovery
in epithelial tumors has been modest, the recent discovery of gene
fusions in prostate cancer has led to a renewed interest in gene fusion
identification in other epithelial cancer subtypes. Perhaps the best
explanation for the sudden increase in the characterization of
recurrent gene fusions is the advent of novel technologies (Table 1).
For example, the use of existing gene-expression data in the discovery
of novel gene fusions was limited until the emergence of cancer
outlier profile analysis (COPA), which ranks genes by normalizing
expression values based on the median absolute deviation of gene
expression to accentuate outlier profiles (reviewed in [130]). When
COPA was applied to gene-expression datasets in the Oncomine
database [193-196], the analysis was able to identify several hallmark
cancer-related genes and led to the discovery of the ERG and ETV1
outlier profiles in prostate cancer [135]. Subsequent exon-walking
quantitative PCR was used to demonstrate loss of the 5' exons in both
ERG and ETV1, giving rise to the notion that a gene fusion event was
responsible for the outlier expression of these genes in prostate
cancer. Finally, 5'-RNA ligase-mediated rapid amplification of cDNA
ends (5'-RACE) was used to identify the 5' untranslated region of
TMPRSS2, a prostate-specific, androgen-regulated, transmembrane
serine protease gene [131,132,197]. Fusion-specific PCR and fluorescence
in situ hybridization (FISH) were used to confirm the genomic
rearrangement.
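
The outlier-ranking idea behind COPA can be illustrated with a minimal sketch. It assumes each gene's expression values are median-centered, scaled by the median absolute deviation (MAD), and then scored by an upper percentile of the transformed values. The function names, the choice of the 90th percentile and the toy expression values are illustrative assumptions, not the published implementation.

```python
import math
from statistics import median

def copa_transform(values):
    """Median-center one gene's expression values and scale them by the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median: nothing to accentuate.
        return [0.0] * len(values)
    return [(v - med) / mad for v in values]

def nearest_rank_percentile(values, q):
    """Nearest-rank percentile (q between 0 and 100) of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def rank_outlier_genes(expression, q=90):
    """Rank genes by the q-th percentile of their transformed values (assumed score)."""
    scores = {
        gene: nearest_rank_percentile(copa_transform(vals), q)
        for gene, vals in expression.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical data: one gene is strongly overexpressed in 2 of 10 samples,
# the other varies smoothly across all samples.
toy_expression = {
    "OUTLIER_GENE": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 9.0, 8.5],
    "UNIFORM_GENE": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
}
ranking = rank_outlier_genes(toy_expression)
print(ranking)
```

Scaling by the MAD rather than the standard deviation keeps a small subset of very high samples from inflating the scale, so a gene that is overexpressed in only a fraction of cases still ranks highly.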

In contrast to using COPA and exon-walking quantitative PCR to
identify fusion gene candidates, several labs are now employing
next-generation sequencing methods wherein DNA or mRNA can be
fragmented, sequenced and mapped to the genome in a matter of
weeks to identify gene fusions. Various commercial platforms have
been developed with the intent of sequencing as much of the genome
or transcriptome as possible and are classified based on the length of
the templates each platform sequences. Long read technologies, like
454, can sequence long templates (>1 kb), whereas short read
technologies, like Solexa and SOLiD, are currently capable of
sequencing 35-50 nucleotide templates. At first glance, long read
technologies may appear to have the advantage of making genome (or
transcriptome) re-assembly much simpler than short read technologies.
However, a major advantage of short read technologies is the
depth of coverage, or the number of times a segment of the genome is
sequenced, which is currently much higher for short read than long
read technologies. As such, the choice of technology is still dependent
on the scientific question.
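
As a rough illustration of depth of coverage, the average depth can be approximated as the total number of sequenced bases divided by the size of the target. The read counts, read lengths and target size below are hypothetical round numbers chosen for easy arithmetic, not platform specifications.

```python
def mean_depth(num_reads: int, read_length: int, target_size: int) -> float:
    """Approximate average depth: total sequenced bases / target size."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length / target_size

# Hypothetical runs against the same 3,000,000-base target.
short_read_depth = mean_depth(num_reads=1_000_000, read_length=36, target_size=3_000_000)
long_read_depth = mean_depth(num_reads=6_000, read_length=1_000, target_size=3_000_000)
print(short_read_depth, long_read_depth)
```

In this toy comparison the short read run produces many more reads, so each position is covered more often even though every individual read is shorter.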

If our question is to identify the best method for novel fusion gene
discovery, we assume that sequencing the transcriptome space will be
much more efficient than sequencing cancer genomes. In theory, the
discovery of gene fusions by long read technology will require
sequencing across the actual gene fusion boundary of the chimeric
transcript. In contrast, short read technologies may be able to identify
gene fusions by two different methods. The first and most
straightforward method is the identification of sufficient short reads that do
not map directly to the transcriptome, but correspond to the gene
fusion boundary; these short reads should identify both
contributing genes with high probability. Second, because transcripts
are thought to be sequenced with a uniform distribution across the
length of the transcript, except at the extreme 5' and 3' ends, exon
expression for each transcript can be analyzed. Genes involved in
rearrangements, leading to chimeric transcripts, would be expected to
lack any exon expression on one of the transcript ends. However, this
method will need to be carefully developed, as mapping of short reads
to duplicated sequences (or sequences that appear more than one
time in the genome) remains challenging.
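
The second approach, looking for genes that lack exon expression at one end of the transcript, can be sketched as follows. This is a minimal illustration that assumes per-exon read counts are already available in 5' to 3' order; the function name, the `min_reads` threshold and the example counts are hypothetical.

```python
def one_sided_expression(exon_counts, min_reads=5):
    """Flag a transcript whose expressed exons form one block touching only one end.

    exon_counts: read counts per exon, ordered 5' to 3'.
    Returns ("5prime_loss", index of first expressed exon),
            ("3prime_loss", index of last expressed exon), or None.
    """
    expressed = [count >= min_reads for count in exon_counts]
    if all(expressed) or not any(expressed):
        return None  # fully expressed or silent: no one-sided pattern
    first = expressed.index(True)
    last = len(expressed) - 1 - expressed[::-1].index(True)
    if not all(expressed[first:last + 1]):
        return None  # gaps inside the expressed block: ambiguous, skip
    if first > 0 and last == len(expressed) - 1:
        return ("5prime_loss", first)
    if first == 0 and last < len(expressed) - 1:
        return ("3prime_loss", last)
    return None

# Hypothetical per-exon counts.
print(one_sided_expression([0, 0, 1, 40, 38, 45]))  # 5' exons missing
print(one_sided_expression([30, 28, 31, 0, 0]))     # 3' exons missing
print(one_sided_expression([30, 31, 29]))           # uniformly expressed
```

A real analysis would also need to handle reads that map to duplicated sequences, which is the difficulty noted above; this sketch ignores that problem.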

To test whether short or long read technology was better for the
discovery of recurrent gene fusions, we recently sought to “re-discover”
the known gene fusions BCR-ABL1 and TMPRSS2-ERG by
sequencing the RNA transcriptome of either the leukemia cell line
K562 or the prostate cell line VCaP, respectively, with both short and
long read platforms [171]. Initially both technologies were able to
identify the known gene fusion from the sample, but were also able to
identify several other candidate gene fusions. For example, the
Illumina short read platform nominated 428 candidates from the
VCaP cell line [171]. However, most of these candidates were likely to
result from either trans-splicing [198], co-transcription of adjacent
genes followed by intergenic splicing [199], or the
sample preparation protocol. In order to reduce the list of potential
candidate genes, we intersected the results of the two platforms to
yield a much more condensed list. Indeed, by integrating the short
read and long read platforms rather than constraining the analysis to
either short or long read technology, we were able to significantly
reduce the percent of false positive gene fusions discovered [171].
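
The intersection step can be illustrated with a small sketch. It assumes each platform yields a list of candidate fusions as (5' gene, 3' gene) pairs; the function name and the placeholder gene names other than TMPRSS2-ERG are hypothetical.

```python
def intersect_candidates(short_read_calls, long_read_calls):
    """Keep only fusion candidates nominated by both platforms."""
    return sorted(set(short_read_calls) & set(long_read_calls))

# Hypothetical candidate lists; only TMPRSS2-ERG is a fusion named in the text.
short_read_calls = [("TMPRSS2", "ERG"), ("GENE_A", "GENE_B"), ("GENE_C", "GENE_D")]
long_read_calls = [("TMPRSS2", "ERG"), ("GENE_E", "GENE_F")]

shared = intersect_candidates(short_read_calls, long_read_calls)
print(shared)
```

Candidates that arise from artifacts specific to one platform are unlikely to appear in both lists, which is why the intersection is shorter than either input.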

In the future, an even newer adaptation of next-generation
sequencing will likely replace the current reliance on both short and
long read technologies for fusion gene discovery. Paired end
sequencing is a method in which short read technology is used to
sequence nucleotides from both the 5' and 3' ends of 200-300
nucleotide fragments of the genome (or transcriptome). By sequencing
both ends of a fragmented RNA, paired end sequencing not only enhances
the reliability of mapping and assembly, but also maintains
significant sequencing depth. In a manner similar to our recent
integration of short and long read platforms, the use of paired end
sequencing technology for gene fusion discovery should first be
examined by comparing the ability of matched mate-pairs to identify
known gene fusions from control samples. With paired end sequencing,
a single sample preparation and individual sequencing run will
hopefully provide sufficient coverage for gene fusion discovery. These
improvements, as well as other advancements in modern
sequencing technologies, will likewise lead to a dramatic improvement
in our ability to identify novel, pathogenic gene fusions.
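
One way mate-pairs could nominate fusions is sketched below. It assumes each mate has already been mapped to a gene, and it counts pairs whose two mates fall in different genes. The function name, the `min_support` threshold and the placeholder genes are hypothetical, and mapping itself is outside the scope of the sketch.

```python
from collections import Counter

def nominate_fusions(mate_pair_genes, min_support=2):
    """Count mate-pairs whose two ends map to different genes.

    mate_pair_genes: iterable of (gene_of_mate_1, gene_of_mate_2).
    Returns {(gene_a, gene_b): supporting pair count} for pairs with
    at least min_support supporting mate-pairs (genes sorted by name).
    """
    support = Counter()
    for gene_1, gene_2 in mate_pair_genes:
        if gene_1 != gene_2:
            support[tuple(sorted((gene_1, gene_2)))] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}

# Hypothetical mapped mate-pairs from a control sample with a known fusion.
pairs = [
    ("TMPRSS2", "ERG"),
    ("ERG", "TMPRSS2"),
    ("TMPRSS2", "TMPRSS2"),  # concordant pair, ignored
    ("GENE_X", "GENE_Y"),    # single discordant pair, below threshold
]
nominated = nominate_fusions(pairs)
print(nominated)
```

Requiring more than one supporting pair is a simple guard against isolated mis-mapped reads; the appropriate threshold would have to be set against control samples with known fusions.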

8. Lessons from the JAZF1-JJAZ1 chimera

Advances  in  sequencing  technology  will  most  likely  lead  to  a  rapid 
increase  in  the  number  of  characterized  gene  fusions  over  the  next 
few  years.  However,  a  much  more  pertinent  question  may  address  the 
reasons  for  chromosomal  rearrangements  leading  to  gene  fusions. 
Could fusion transcripts be a part of normal cell biology? It is also
plausible that tissue-specific fusions could impart growth advantages
that allow a cell to survive traumatic stress. Nonetheless, while the
underlying molecular mechanisms triggering genomic rearrangement
are still unclear, we surmise that once a genomic rearrangement
occurs,  cells  harboring  favorable  gene  fusions  will  be  selected  over 
time. 

Insight into the development of genomic rearrangements may
come from fundamental observations made following the study of
endometrial stromal (EMS) tumors. In 2001, a recurrent translocation
t(7;17)(p15;q21) was demonstrated to occur in EMS tumors that led
to expression of the chimeric JAZF1/JJAZ1 mRNA transcript [200].
Although the mechanism leading to this rearrangement remains
unknown, a recent study demonstrated that trans-splicing of RNAs in
normal human endometrial stromal cells can lead to the chimeric
JAZF1/JJAZ1 RNA and protein independent of chromosomal rearrangement
[201]. This observation suggests that certain gene fusions may
be generated by trans-splicing of RNAs, which then lead to
chromosomal rearrangement due to their pro-neoplastic nature.
Interestingly, the group also demonstrated that the RNA trans-splicing
event leading to the JAZF1/JJAZ1 chimera was inhibited at high
concentrations of either estrogen or progesterone, further suggesting
that certain RNA fusions may occur in a hormone-dependent manner.
The question of whether or not other specific gene fusions arise due to
abnormal exposure to specific hormones has not been studied.

9.  Conclusions 

A  limited  number  of  epithelial  gene  fusions  have  been  described 
and  the  quest  for  novel  recurrent  gene  fusions,  like  the  discovery  of 
TMPRSS2-ERG  gene  fusions  in  prostate  cancer,  may  provide  major 
advances  in  cancer  research  in  the  near  future.  Here,  we  have 
demonstrated  that  gene  fusions  lead  to  overexpression  or  constitutive 
activation  of  oncogenes  by  a  variety  of  unique  mechanisms  including 
fusion of housekeeping or tissue-specific gene promoters to oncogenes,
as in the case of the TMPRSS2 gene promoter and 5'-UTR to ERG, or,
as in the case of HMGA2, through evasion of a microRNA by
replacement of an oncogene's 3'-UTR. Despite the multitude of
mechanisms used by chimeric transcripts to drive malignancy, several
important lessons can be taken from characterized epithelial gene
fusions, studies of MLL translocations, as well as the very recent
discovery of JAZF1-JJAZ1 RNA fusions, which precede genomic
rearrangement  in  specific  cell  types. 

As in the case of Imatinib and BCR-ABL1, perhaps one of the
best methods for interfering with the development of specific
malignancies will be through inhibition of well-characterized,
pathogenic fusion genes with rationally designed, molecularly tailored
therapies. In the future, the use of both COPA and high-throughput
massively parallel sequencing will greatly increase the speed and
reliability of fusion gene discovery on both the genomic and
transcriptomic levels. We expect many more gene fusions to be
reported over the next several years in various tumor types, many of
which will hopefully serve as rational drug targets.
Abstract


In  this  study,  we  have  investigated  the  correlation  between  RhoC  expression  and  the 
metastatic  behavior  of  head  and  neck  squamous  cell  carcinoma.  The  inhibition  of 
RhoC  expression  was  achieved  using  small  hairpin  RNA  (shRNA)  and  lentiviral 
transfection  and  transduction  technology.  Fluorescence  microscopy  of  the  RhoC 
knockdown  stable  clones  showed  a  strong  green  fluorescence  in  the  majority  of  cells, 
signifying  a  high  efficiency  of  transduction.  qRT-PCR  of  lentivirus  infected  cell  lines 
showed  a  70-80%  reduction  in  RhoC  mRNA  expression.  Furthermore,  the  mRNA 
expression  levels  of  other  members  of  the  Ras  superfamily  did  not  show  any 
significant  decrease.  Cell  motility  and  invasion  were  also  markedly  diminished  in 
RhoC  depleted  cell  lines  as  compared  to  parental  lines.  Hematoxylin  and  eosin 
staining of lung tissue obtained from SCID mice implanted with RhoC
knockdown cell lines showed a marked decrease in lung metastasis and inflammation
of  the  blood  vessels.  When  cultured,  lung  tissue  showed  a  significant  decrease  in  cell 
growth  in  the  mice  which  were  implanted  with  RhoC  depleted  cell  lines  as  compared 
to  either  parental  or  shRNA  scrambled  sequence  control  lines.  Microscopic  studies 
of  CD31  revealed  substantial  quantitative  and  qualitative  differences  in  the  primary 
tumor  microvessel  density  as  compared  to  its  parental  and  shRNA-scrambled 
control.  This  study  is  the  first  of  its  kind  to  establish  the  involvement  of  RhoC  in 
head  and  neck  metastasis.  These  findings  suggest  that  RhoC  may  be  a  novel  target 
for  biologic  therapeutic  targeting  in  the  future. 


Introduction 

Head and neck cancer is the sixth most common cancer worldwide (1). As per the
statistical report of the American Cancer Society, about 40,000 new cases of head and neck
squamous cell carcinoma are diagnosed every year in the United States. Among these,
most of the patients are diagnosed at a very late stage (stage III and IV). Despite the
advancement in surgical procedures, chemotherapy and radiation therapy, survival rates have not
improved in the last several decades (2). Furthermore, it has been shown that the high
rate  of  morbidity  is  due  to  both  locoregional  recurrence  and  distant  metastases. 

In the past decade, numerous studies have shown that the Rho family of GTPases
(RhoA, RhoB, RhoC, Rac1, Rac2, Rac3 and Cdc42) is involved in malignant
progression towards metastasis. In fact, several studies have also reported elevated RhoA
and RhoC expression in a number of tumor types (3, 4). Among the Rho protein family,
RhoC (molecular mass of approximately 21 kDa) has been implicated in a wide range of
cellular activities, including gene expression, cell proliferation, intracellular signaling,
and cytoskeletal organization (5). More significantly, RhoC plays a central role in
assembling and facilitating actin-myosin contractions that enhance focal adhesion,
resulting in cell polarity, increased cell motility and, consequently, increased
invasiveness (6-8). In addition, signaling mediated by Rho proteins through Rho
activating kinase (ROCK) regulates proteins that in turn regulate actin polymerization,
such as cofilin, profilin and Formin homology (FH) proteins (9). Interestingly, high levels
of RhoC and ROCK are also associated with membrane blebbing, a phenomenon that is
observed in motile or invasive cells (9, 10).

RhoC over-expression is now well documented in a wide range of malignant
cancers, suggesting an important role in converting non-invasive carcinomas into invasive
forms. Interestingly, over-expression of RhoC has been reported in inflammatory breast
cancer and exclusively in invasive breast carcinoma (11-14). Other tumor types in which
over-expression of RhoC has been reported are ovarian carcinoma (15), esophageal
squamous cell carcinoma (16), pancreatic cancer (17), gastric cancer (16, 18) and human
melanoma (10, 19). In addition, functional studies have shown that RhoC can act as a
transforming oncogene when it is over-expressed in human mammary epithelia,
converting these normally immobile cells into highly motile and invasive malignant cells
(14, 20). Thus, a wide range of current studies reveal an important role of RhoC in cancer
metastasis.

However, very few studies to date have investigated the role of RhoC in head and
neck cancer. Gene expression profiling of stage III and IV regionally
metastatic head and neck squamous cell carcinoma (HNSCC) showed
elevated levels of RhoC when compared to stage I and II localized malignancy (21).
Furthermore, studies in our laboratory have shown that there is elevated RhoC expression
in tumors of patients with HNSCC when compared to normal squamous cell epithelium
(20). More importantly, our study showed that increased RhoC expression is strongly
associated with lymph node metastasis and could also be used to predict metastasis even
in small (T1, T2) primary tumors (22). In the present study, we investigated the role of
RhoC in head and neck metastasis by inhibiting its function using RNA interference
(RNAi). Our in vitro findings showed that inhibiting RhoC function strongly reduced
cell motility and invasion. Furthermore, we observed a remarkable reduction in tumor
metastasis and microvessel density in SCID mice injected with RhoC knockdown cell
lines. These findings suggest that inhibition of RhoC function in head and neck
squamous cell carcinoma can diminish this tumor's aggressive behavior, thus opening
new possibilities for future drug therapies targeting this pathway.

Materials  and  Methods 
Cell  culture 

University of Michigan squamous cell carcinoma cell lines UM-SCC-11A and
UM-SCC-1 are well established cell lines derived from a 65-year-old patient with a
T2N2a tumor of the epiglottis and a 46-year-old patient with a T2N0 tumor of the false
vocal cord (23, 24). These cell lines were grown at 37°C in a humidified atmosphere with 95% air-5%
CO2. The cultures were maintained in Dulbecco's Modified Eagle Medium (DMEM)
(Gibco/BRL; Gaithersburg, MD) containing 10% heat-inactivated fetal bovine serum
(Hyclone; Logan, UT) and supplemented with 50 µg/ml of penicillin G and 50 µg/ml of
streptomycin sulphate. Sub-culturing of 80% to 90% confluent cells was routinely
performed using trypsin-EDTA solution (0.05% trypsin and 0.53 mM EDTA). At harvest,
the cells were treated with trypsin, washed, concentrated by centrifugation, and counted
with a hemocytometer. The cells were assessed for viability by the trypan blue exclusion test
(>95%) and then re-suspended to a final density of 5.0 x 10^ cells per ml DMEM.

Lentivirus infection and selection of positive RhoC knockdown clones

RhoC knockdown and scrambled sequence constructs with a green fluorescent
protein (GFP) tag and puromycin resistance sites were synthesized by the vector core
facility of the University of Michigan (www.med.umich.edu/vcore). The sequences used
for the RhoC constructs are available from Open Biosystems (www.openbiosystems.com) and
are: oligo ID V2LHS_69446 and V2LHS_69410, accession number NM_001042678.
The sequences of the constructs are 69446 = 5'-ATACTGTCTTTGAGAACTATAT
(Sense) [for RhoC knockdown clone 1] and 69410 = 5'-CACCAGCACTTTATACACTTC
(Sense) [for RhoC knockdown clone 2]. The
sequence of the shRNAmir non-silencing (scrambled) control is:
ATCTCGCTTGGGCGAGAGTAAGTGCTGTTGACAGTAAGC
GATCTCGCTTGGGCGAGAGTAAGTAGTGAAGCCACAGATGTACTTACTCTCG
CCCAGCGAGAGTGCCTACTGCCTCGGA. This control sequence does not match any
known mammalian genes (the sequence had at least 3 mismatches against any
gene, as determined via nucleotide alignment/BLAST of the target 22mer sequence).
This is the non-silencing shRNAmir hairpin sequence found in the pSM2, pSMP, pGIPZ,
pTRIPZ and pLemiR non-silencing controls. The entire process of lentivirus infection
and selection is described under the following subheadings.

(a) CaCl2 transfection of lentiviral packaging 293FT cells: The 293FT cell line was
procured from Invitrogen, Carlsbad, CA. Fresh DMEM-10% FBS with 25 µM chloroquine was
added to 60-70% confluent 293FT cells (which were previously seeded at a cell density of ~5 x
10^) and incubated at 37°C for an hour. In a total 500 µL reaction volume, 4 µg of DNA
constructs, 4 µg each of the viral packaging vectors, namely Gag, Pol, and Env, and 62.5 µL
of 2 M CaCl2 solution (final concentration 250 mM) were added. This reaction mixture
was titrated to 500 µL against 2X HBS buffer, pH 7.02. One milliliter of this solution
containing viral particles was added onto the 293FT cells. Next, cells were incubated for
another 12 hours at 37°C. Media was changed after 12 hours to remove chloroquine and
fresh DMEM-10% FBS was added to the growing 293FT cells to produce virus.

(b) Infection of cell lines with lentivirus: Supernatant of the 293FT cells was filtered
through a 0.45 µm filter. One milliliter of this filtered DMEM was added to the UM-SCC-11A
and -1 cells growing in 6-well culture plates. Cells were incubated at 37°C and the GFP
expression was monitored after 48 hours of infection.

(c) Selection of positive colonies of cells: Pre-optimized puromycin (1.6 µg/ml) was used
for the selection of the lentivirus-infected RhoC knockdown clones. These clones were
further sorted by flow cytometry to obtain the maximum number of GFP positive cells,
which were used in subsequent studies.
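
As a quick check on the calcium chloride figures quoted in step (a), the final concentration follows from the dilution relation C1 x V1 = C2 x V2. The short Python sketch below is illustrative only and is not part of the original protocol; the function name final_concentration_mM is hypothetical.

```python
def final_concentration_mM(stock_molar, stock_volume_ul, total_volume_ul):
    """Final concentration (mM) after diluting a stock into a total volume.

    Uses C1 * V1 = C2 * V2; both volumes must be in the same unit (here µL).
    """
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_molar * 1000.0 * stock_volume_ul / total_volume_ul


# 62.5 µL of 2 M CaCl2 in a 500 µL reaction, as stated in the text
cacl2_mM = final_concentration_mM(2.0, 62.5, 500.0)
print(cacl2_mM)  # 250.0
```

The result agrees with the 250 mM final concentration given for the 500 µL reaction volume.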

Flow  cytometry  analyses 

About 70-80% confluent lentivirus-infected cells were harvested using trypsin-EDTA
solution and re-suspended in phosphate buffered saline containing 3% FBS, 0.5 mM
EDTA and 60 U/ml DNase. Flow cytometry analysis was performed using a BD FACS
Aria IIU flow cytometer equipped with a 488 nm, 15 mW, air-cooled argon laser
(Analytical Cytometry Laboratory, Ohio State University Comprehensive Cancer
Center). GFP positive cells were sorted out and grown for subsequent experiments.

Quantitative reverse transcriptase polymerase chain reactions (qRT-PCR)

Total RNA was isolated according to the standard procedure using TRIzol
reagent (Invitrogen, Carlsbad, CA). Quantitative reverse transcriptase polymerase chain
reactions (qRT-PCR) were conducted with the TaqMan probe system from Applied
Biosystems (Foster City, CA) using the following assays: Cdc42, Hs03044122_g1; Rac1,
Hs01025984_m1; Rac2, Hs01032884_m1; and RhoC, Hs00733980_m1. Beta actin and
G3PDH were used as the data normalizers. Relative changes in gene expression were
calculated using the 2^-ΔΔCt method (25).
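
For readers unfamiliar with the 2^-ΔΔCt calculation, a minimal Python sketch follows. It assumes that the Ct values of the two normalizer genes are averaged arithmetically before subtraction, which is one common convention and is not stated in the text. The function name and the Ct values in the example are hypothetical and are not data from this study.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_refs_sample,
                        ct_target_control, ct_refs_control):
    """Relative expression of a target gene by the 2^-ddCt method.

    ct_refs_* are lists of Ct values for the normalizer genes; their
    arithmetic mean is used as the reference Ct (an assumption).
    """
    delta_sample = ct_target_sample - mean(ct_refs_sample)
    delta_control = ct_target_control - mean(ct_refs_control)
    delta_delta_ct = delta_sample - delta_control
    return 2.0 ** -delta_delta_ct


# Hypothetical Ct values: the target crosses threshold 2 cycles later
# in the knockdown sample than in the control, with equal normalizers.
fold = relative_expression(26.0, [18.0, 20.0], 24.0, [18.0, 20.0])
print(fold)  # 0.25
```

A value of 0.25 corresponds to a 75% reduction in target mRNA relative to the control sample.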

Cell  Invasion  and  Motility  Assay 

Invasion assay: Cell invasion assays were performed using the BD BioCoat Matrigel
Invasion Chamber (BD Biosciences, Bedford, MA, USA) according to the
manufacturer's instructions. Briefly, about 2.5 x 10^
cells in 2 ml of serum-free DMEM were added at the top of the insert and 1 ml of media
was added in the bottom wells of each insert. Fetal bovine serum (FBS) was
added to the media of the lower chamber (final concentration of FBS was 10%, v/v) to
act as a chemoattractant. Cells were incubated for 22 hours in a humidified cell culture
incubator at 37°C in a 5% CO2 atmosphere. Next, the non-invading cells at the top of the
insert were scraped off with a cotton-tipped swab. The invading cells, which
were attached to the underside of the membrane, were fixed in 100% methanol and
stained with 1% toluidine blue prepared in 100% methanol. After repeated washing of the
membrane with distilled water, stained cells were allowed to air dry at room temperature
before they were visualized under a microscope. A parallel experiment with a control insert
(without Matrigel) was also run. Matrigel-invaded cells were counted microscopically at
X100 magnification.

Motility assay: Cell motility assays were performed in 100 mm Petri dishes. At about
80% confluence, cells were washed with PBS, and a fine scratch in the form of a groove
was made with a sterile pipette tip and immediately photographed. We designate
this time as the zero hour. Next, cells were supplemented with DMEM containing 10%
FBS and allowed to grow. Migration of cells from the edge of the groove towards the
center was monitored microscopically after 24 hours to assess the extent of the scratched area
covered. The width of the scratch was measured at zero hour and after 24 hours to
calculate the percentage of the gap covered by the cells in 24 hours.
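
The percentage of the gap covered can be computed from the two width measurements as (width at 0 h - width at 24 h) / width at 0 h x 100. The Python sketch below illustrates this arithmetic under that assumption; the function name and the example widths are hypothetical.

```python
def percent_gap_closed(width_0h, width_24h):
    """Percentage of the initial scratch width covered after 24 hours.

    Both widths must be in the same unit (for example micrometers).
    """
    if width_0h <= 0:
        raise ValueError("initial scratch width must be positive")
    return (width_0h - width_24h) / width_0h * 100.0


# Hypothetical example: an 800 µm scratch narrows to 200 µm in 24 hours
print(percent_gap_closed(800.0, 200.0))  # 75.0
```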

Animal  Model 

Athymic severe combined immune deficient (SCID) mice were obtained from the
Jackson Laboratory, Bar Harbor, ME, USA. Six-week-old mice were housed in cages of 5
animals each. Five animals per treatment were selected to receive parental, shRNA-scrambled
sequence control and RhoC knockdown clone cells, resulting in 15 animals per cell
line per set of experiments. About half a million UM-SCC-11A and -1 cells were
suspended in 100 µl serum-free DMEM and injected through the tail vein and/or in the
flank region of the mice using a 0.5 inch, 27-gauge needle. Animals were monitored
every other day for their general health and activity. At the end of the second week the
animals were euthanized using a CO2 chamber. The lungs were dissected and half of the
lungs were fixed in buffered formalin for 6 hours, thereafter transferred to 70% methanol
and then processed to form paraffin-embedded tissue blocks. Hematoxylin and eosin (H
and E) staining was done. The remaining half of the lungs was digested in collagenase for
culturing the cells. At the end of week 12, tumors in the flank region were fully grown. The
animals were euthanized and tumors were dissected and fixed in the same way as
described above for CD31 staining.

Lung  Metastases 

Slides of five-micrometer sections of lung were prepared and H & E stained. Five
random fields were examined microscopically in a blinded fashion at low power (X40
and X100 magnifications) to detect metastases.

Microvessel  Density 

Microvessel density in all primary tumors was assessed using anti-mouse CD31
antibody (PharMingen, San Diego, CA) at a dilution of 1:250. Five random high-power
fields (X40 and X100 magnifications) were selected to visualize the microvessels. The
mean was reported in a blinded fashion for each tumor.

Statistical  Analysis 

Statistical  analyses  were  performed  using  sigma  graph  pad  prism  4  software.  The 
mean was reported with standard deviation (±SD). Differences were considered to be
statistically  significant  when p  values  were  less  than  0.05. 
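
As an illustration of the mean ± SD summary described above, the following Python sketch uses the standard library. It assumes the sample (n - 1) standard deviation; the text does not state which form of the standard deviation or which significance test was used. The function name and the replicate values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation) for replicate measurements."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an SD")
    return mean(values), stdev(values)


# Hypothetical triplicate measurements (n = 3)
m, sd = summarize([50.0, 60.0, 70.0])
print(f"{m:.1f} ± {sd:.1f}")  # 60.0 ± 10.0
```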

Results 

RhoC mRNA expression is greatly reduced in knockdown clones from head and
neck squamous cell carcinoma cell lines

The inhibition of RhoC function was achieved using small hairpin RNA (shRNA)
and lentiviral transfection and transduction technology. Briefly, the cell lines were
transduced with highly modified lentiviral vectors that carry small sense and anti-sense
sequences against the RhoC gene. In the infected cells, these sequences are transcribed
into small hairpin RNAs (shRNAs), which then trigger the endogenous and highly specific
RNA degradation machinery that targets RhoC mRNA as it is synthesized.

To test the lentiviral infection efficiency and to select positive clones
(RhoC knockdown) in the selected cell lines, the vector was designed to express green
fluorescent protein (GFP) and a puromycin resistance gene. After lentiviral infection,
positive (stable) clones were selected using puromycin (1.6 µg/ml). As shown
in figures 1A and 1B, flow cytometry revealed that the number of non-infected cells was
very low. In addition, fluorescence microscopy of the stable clones showed
strong green fluorescence in the majority of the cells, signifying a high efficiency of
transduction. The GFP positive cells were further sorted and re-grown for subsequent
experiments.

We then tested the effectiveness of shRNA in depleting RhoC mRNA expression
by real-time quantitative PCR (qPCR) in our selected cell lines. Because only a small
number of specific gene sequences are capable of activating the RNA degradation
pathway, we used two different RhoC knockdown clones (namely C1 and C2), along with
parental and shRNA-scrambled sequence infected controls, to ensure effective
depletion of RhoC. The results show greatly reduced expression levels of the RhoC
gene in the C1 and C2 RhoC knockdown clones, while normal RhoC expression was
observed in clones with the shRNA-scrambled sequence (Fig. 2). The relative RhoC mRNA
expression levels in parental, shRNA-scrambled control and RhoC knockdown clones 1 and 2
were evaluated by quantitative RT-PCR, and the Ct values thus obtained were normalized
using two housekeeping genes as described in the Materials and Methods section. As shown
in figure 2A, mRNA expression decreased about 75% and 80% in RhoC knockdown
clone 1 and clone 2 of UM-SCC-1, respectively, while a decrease of 40% in clone 1 and
70% in clone 2 of UM-SCC-11A was observed. The control shRNA-scrambled
sequence did not show any significant reduction in RhoC mRNA expression level in
either cell line. To confirm that only RhoC expression was being inhibited, the mRNA
levels of other Rho family members, Cdc42, Rac1 and Rac2, were analyzed by
quantitative RT-PCR. As shown in figures 2B, C and D, the expression levels of Cdc42,
Rac1 and Rac2 were not affected significantly, thus confirming that our shRNA process is
highly specific to RhoC. These studies provided clear evidence of the "switching
off" of the RhoC machinery by decreasing total levels of RhoC mRNA expression, and,
therefore, further detailed studies on its functional roles are justified. One of the most
basic clinical questions that arises at this point is how knockdown of the RhoC transcript
affects metastasis in head and neck cancer. To address this question, we investigated two
characteristic behaviors of metastatic cells, invasion and motility, in the transduced cell
lines.

RhoC  knockdown  clones  show  decrease  in  cell  invasion  and  motility 

Invasion assay: In our study we found that RhoC-depleted clones of UM-SCC-11A
and -1 were remarkably less invasive and motile than their parental or shRNA-scrambled
controls (Figs. 3A and B). Cell invasion was decreased by 50%
and 75% in RhoC knockdown clones 1 and 2, respectively, of UM-SCC-11A, and by 60%
and 80% in clones 1 and 2 of UM-SCC-1, as compared to their parental or shRNA-scrambled
controls (n=3; p<0.003).

Motility assay: As shown in figures 4A and B, RhoC knockdown clones showed a
noticeable decrease in cell motility (n=3; p<0.005), suggesting that RhoC depletion
could help prevent metastasis in head and neck cancer patients. Together, these in vitro
results provide supporting evidence that RhoC plays an important role in cell invasion and
metastasis in head and neck cancer, and that depleting RhoC may be a useful
therapeutic approach in head and neck carcinoma.

RhoC  plays  an  important  role  in  lung  metastasis  and  microvessel  density  formation 

Besides localized tumor, lung metastasis is common in head and neck cancer
patients (26). With this in mind, we designed an in vivo study to
analyze the effect of RhoC inhibition on lung metastasis and on primary tumor vascularity.
In our in vitro study, we found that RhoC knockdown clones 1 and 2 of the cell lines
tested gave very similar results in our motility and invasion assays. Therefore, for
further in vivo studies, we selected only clone 2 from both cell lines to test for metastasis.
Three groups of SCID mice with five mice in each group were used for in vivo studies.
Half a million cells from parental, shRNA-scrambled control and RhoC knockdown clone
2 were injected through the tail vein and about a million cells were implanted in the flank
region for further studies. Two weeks later the lungs were dissected; half of the lungs were
stained with H and E and the remaining half were cultured. Figure 5A shows a remarkable
difference in lung metastasis between parental, shRNA-scrambled control and RhoC knockdown
clone. The bar graph also shows the number of cancer cells grown from the digested lungs of
mice injected with parental, shRNA-scrambled control and RhoC knockdown clones. A
large mass of metastatic, highly inflamed tissue and blood vessels was found in the lung
region of the mice injected with either parental or shRNA-scrambled control cells, while only
a few very small patches could be seen in the lungs of the mice injected with RhoC
knockdown clones. The bar graph shows a 67% and 58% decrease in cell number in RhoC
knockdown clones when compared to their parental lines (Fig. 5B). These results strongly
suggest that inhibition of RhoC expression greatly reduces metastasis in vivo.
Furthermore, microvessel density of the localized solid primary tumor, which grows to a
sizable volume 12 weeks after implantation in the flank region, was analyzed using
CD31 antibody. Microscopic analysis of the CD31-stained tumors revealed a remarkable
difference in microvessel formation: microvessels in RhoC knockdown clones were very small
and poorly developed as compared to those in the corresponding parental or shRNA-scrambled
control (figure 6). A similar pattern of lung metastasis was also observed in UM-SCC-11A
(data not shown).

Discussion 

Tumor metastasis is well correlated with the over-expression of certain
oncogenes. The over-expression of the Rho gene family has been reported in many
malignant forms of cancer (27), including pancreatic cancer (17), gastric cancer (16, 18)
and human melanoma (10, 19). However, there have been very few studies on whether
over-expression of RhoC is involved in head and neck metastasis. Previous studies in our
laboratory have shown that RhoC is actively expressed in several well established
University of Michigan Squamous Cell Carcinoma (UM-SCC) cell lines. Among the cell
lines tested, the UM-SCC-11A and -1 lines exhibited considerably high levels of RhoC
(22). In particular, the active form of RhoC (RhoC GTPase) was observed to be
constitutively expressed in the UM-SCC lines. Therefore, for our current study, we
selected two UM-SCC lines (UM-SCC-11A and UM-SCC-1) to evaluate the role of RhoC
in head and neck squamous cell carcinoma metastasis. Our first and foremost aim was to
inhibit RhoC expression in the two selected cell lines and analyze its function in vitro.
Our expectation was that motility and invasion would be greatly reduced in RhoC-depleted
cell lines as compared to parental lines. The data from the present study show
that the newly emerging tool of gene silencing using lentiviral infection is an efficient
way to achieve this goal. A characteristic feature of lentiviral vectors is that they can
integrate into the genome of not only dividing cells but also non-dividing cells, thereby
achieving stable, long-term expression of shRNAs. The inhibition of cervical cancer
and melanoma growth by lentivirus gene silencing strategies has been well
established (28, 29). In this study we have demonstrated successful inhibition of RhoC
gene expression and, subsequently, function using shRNA techniques (Fig. 2).
Furthermore, our data show that cell invasiveness and motility, which are characteristics
of aggressive head and neck cancer cell lines, were diminished when RhoC expression
was inhibited (Figs. 3 and 4). Therefore, these results suggest that RhoC over-expression
drives cell invasion and motility in HNSCC. It is reported that one of the major functions
of the Rho family of proteins is to control cytoskeletal organization (30). Cytoskeletal
proteins are predominantly involved in cell motility. Therefore, RhoC may control
metastasis by modulating cell motility (31). In order to move, cells
need to turn over both cell-extracellular matrix and cell-to-cell adhesions, which
include both adherens junctions and tight junctions (32, 33). It has also been reported
that RhoC plays a predominant role over RhoA in the weakening of adherens junctions,
which is an important step towards transforming cells into an invasive phenotype (5).
These studies therefore raise the question of what effect RhoC inhibition would have
in vivo. Our in vivo results showed that both lung inflammation and a large volume of
lung metastases were present in animals which were implanted by peritoneal injection of
either parental or shRNA-scrambled sequence (control) cell lines. In contrast, the lungs of
mice implanted or injected with RhoC knockdown lines were free from pathological
findings, specifically lung metastases and inflammation in lung tissues and blood vessels
(Fig. 5). Furthermore, the level of angiogenesis in the localized tumors was assessed using
CD31 antibody, and these results showed a remarkable difference in both the quality and
the quantity of the microvessels in the tumors. The mice implanted with RhoC knockdown
lines showed markedly fewer and more poorly developed microvessels as compared to the
far more numerous and clearly defined vessels in parental or shRNA-control cell lines
(Fig. 6).

The findings in this manuscript may open a fertile area of
research in head and neck squamous cell carcinoma. For instance, recent work has
shown that matrix metalloproteinases, well known mediators of invasive tumor behavior,
are specific and critical players in the formation of lung metastasis
(34, 35). Li et al. (2006) reported that primary breast tumor growth and pulmonary
metastasis driven by the oncogene AF1Q are, at least in part, regulated by
MMP and RhoC expression (36). The remodeling of the actin cytoskeleton is a critical
step in the formation of pulmonary metastasis because of the resulting changes in cell shape,
polarity, cell interactions and eventual migration of the cancer cells. Interestingly,
studies by Nelson et al. have shown that the expression of the MMP3 gene that induces
epithelial-mesenchymal transition in mammary epithelial cells is brought about by a
change in cell shape through Rac1 (also a member of the Rho family) mediated changes
in cytoskeletal structure (37). Clearly, future studies elucidating the specific interactions
between MMP2, 3 and 9 (major MMP proteins in head and neck squamous cell
carcinoma) and RhoC are indicated and may prove very informative.

In conclusion, the findings presented in this manuscript illustrate that under both in vivo
and in vitro conditions RhoC plays an important role in head and neck cancer tumor
progression and metastasis. With additional investigations and the ongoing development of
RhoC-specific inhibitors, RhoC may prove to be an important therapeutic target in this
patient population.

Figure Legends


Fig. 1 Lentivirus-infected cells showing the GFP expression level. Panel (A) shows
UM-SCC-11A: shRNA-scrambled sequence control (SR), RhoC knockdown clone 1 (c1),
RhoC knockdown clone 2 (c2) and uninfected cells as the control (negative). The upper panel
shows the histograms obtained by flow cytometry. The middle and lower panels show
the GFP-labeled cells under fluorescent light and bright light. Panel (B) shows
UM-SCC-1. All other notations are the same as described above. As shown here, a high
number of cells were infected, as evident from the GFP expression patterns.

Fig. 2 Quantitative RT-PCR of UM-SCC-1 and -11A shows the relative mRNA
expression pattern of (A) RhoC, (B) Cdc42, (C) Rac1 and (D) Rac2 in parental,
shRNA-scrambled control and RhoC knockdown clones 1 and 2 after infection with lentivirus
using shRNA strategies. Results were analyzed using the 2^-ΔΔCt method. A significant
decrease in RhoC mRNA expression was obtained in the RhoC knockdown clones, while the
expression of Cdc42, Rac1 and Rac2 remained unchanged (p<0.05).

Fig. 3 The extent of cell invasion through Matrigel. (A & E) Parental cell lines; (B & F)
shRNA-scrambled controls; (C & G) RhoC knockdown clones 1; (D & H) RhoC
knockdown clones 2 of UM-SCC-11A and -1, respectively. Columns I and J, rates of
invasion; bars, 95% CI; P<0.05.

Fig. 4 Effect of RhoC knockdown on cell motility. Panels A and B show the slower
movement of RhoC knockdown cells as compared to their parental or shRNA-scrambled
controls in UM-SCC-11A and -1, respectively. Bar graphs show the percent
motility (p<0.05); the reference point is the zero hour value.

Fig. 5 H & E stained slides show the microscopic lung metastasis. Parental (A & D),
shRNA-scrambled controls (B & E), RhoC knockdown clones (C & F). The numbers of cells
obtained by culturing the lungs are depicted in G and H for UM-SCC-11A and -1,
respectively (p<0.05).

Fig. 6 Microvessel density assessment after staining with CD31 antibody (A, B and C).
Representative high-power fields from tumors developed from the parental UM-SCC-1 line (A),
shRNA-scrambled sequence control (B) and RhoC knockdown clone (C). Microvessels were
smaller and poorly developed in the RhoC knockdown clone.